Diffstat (limited to 'ult/ult_7/script2col.rst')
-rw-r--r-- | ult/ult_7/script2col.rst | 210 |
1 files changed, 210 insertions, 0 deletions
diff --git a/ult/ult_7/script2col.rst b/ult/ult_7/script2col.rst new file mode 100644 index 0000000..b15e85c --- /dev/null +++ b/ult/ult_7/script2col.rst @@ -0,0 +1,210 @@ +.. Objectives +.. ---------- + + .. At the end of this tutorial, you will be able to: + + .. 1. Sort lines of text files + .. 2. Print lines matching a pattern + .. 3. Translate or delete characters + .. 4. Omit repeated lines + + +.. Prerequisites +.. ------------- + +.. 1. Getting started with Linux +.. 2. Redirection and Piping + + + +Script +------ + + + ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| {{{ Show the first slide containing title, name of the production                | Hello friends and welcome to the tutorial on 'Text Processing'.                  | +| team along with the logo of MHRD }}}                                             |                                                                                  | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| {{{ Show slide with objectives }}}                                               | At the end of this tutorial, you will be able to,                                | +|                                                                                  |                                                                                  | +|                                                                                  | 1. Sort lines of text files                                                      | +|                                                                                  | #. Print lines matching a pattern                                                | +|                                                                                  | #. Translate or delete characters                                                | +|                                                                                  | #. Omit repeated lines.                                                          | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| {{{ Switch to the pre-requisite slide }}}                                        | Before beginning this tutorial, we suggest that you complete the                 | +|                                                                                  | prerequisite tutorials displayed on the slide.                                   | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| {{{ Open the terminal }}}                                                        | In this tutorial, we shall learn about text processing.
| +| ::                                                                               | To begin with, consider data kept in two files, namely marks1.txt and            | +|                                                                                  | students.txt.                                                                    | +|     cat marks1.txt                                                               | Let us see what data they contain. Open a terminal and type,                     | +|     cat students.txt                                                             |                                                                                  | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| ::                                                                               | Let's say we wish to sort the output in the alphabetical order                   | +|                                                                                  | of the students' names. We can use the ``sort`` command for this                 | +|     cut -d " " -f 2- marks1.txt | paste -d " " students.txt -| sort               | purpose.                                                                         | +|                                                                                  |                                                                                  | +|                                                                                  | We just pipe the previous output to the ``sort`` command as,                     | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| ::                                                                               | Let's say we wish to sort the names, based on the marks in the first             | +|                                                                                  | subject i.e. the first column after the name. ``sort`` command also allows us to | +|     cut -d " " -f 2- marks1.txt | paste -d " " students.txt -| sort -t " " -k 2   | specify the delimiter between the fields and sort the data on a particular       | +|                                                                                  | field. ``-t`` option is used to specify the delimiter and ``-k`` option          | +|                                                                                  | is used to specify the field.                                                    | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| {{{ Show slide with, Sort... }}}                                                 | This command gives us a sorted output as required. But, what if we would         | +|                                                                                  | like the output to appear in the reverse order? ``-r`` option allows the output  | +|                                                                                  | to be sorted in the reverse order and the ``-n`` option is used to choose        | +|                                                                                  | a numerical sorting.
| ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| {{{ Switch to the terminal }}}                                                   | Let us do it on the terminal and see for ourselves,                              | +| ::                                                                               |                                                                                  | +|                                                                                  |                                                                                  | +|     cut -d " " -f 2- marks1.txt | paste -d " " students.txt -|                    |                                                                                  | +|         sort -t " " -k 2 -rn                                                     |                                                                                  | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| ::                                                                               | Suppose, while you are compiling the student marklist, Anne walks up to you and  | +|                                                                                  | wants to know her marks. You, being the kind person you are, oblige.             | +|     cut -d " " -f 2- marks1.txt | paste -d " " students.txt - | grep Anne         | But you do not wish her to see the marks that others have scored. What           | +|                                                                                  | do you do? Here, the ``grep`` command comes to your rescue.                      | +|                                                                                  |                                                                                  | +|                                                                                  | ``grep`` is a command line text search utility. You can use it to search         | +|                                                                                  | for Anne and show her, what she scored. ``grep`` allows us to search for a       | +|                                                                                  | search string in files. But we could, like any other command, pipe the           | +|                                                                                  | output of other commands to it. So, we shall use the previous combination        | +|                                                                                  | of cut and paste that we had, to get the marks of students along with their      | +|                                                                                  | names and search for Anne in that.                                               | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| ::                                                                               | This will give us only the line containing the word Anne as the output.          | +|                                                                                  | The grep command is by default case-sensitive. So, we wouldn't have got          | +|     cut -d " " -f 2- marks1.txt | paste -d " " students.txt - | grep -i Anne      | the result if we had searched for anne, with a small a, instead of               | +|                                                                                  | Anne, with a capital a. But, what if we didn't know whether the name was         | +|                                                                                  | capitalized or not?
``grep`` allows you to do case-insensitive searches          | +|                                                                                  | by using the ``-i`` option.                                                      | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| ::                                                                               | Now, in another scenario, if we wished to print all the lines that do            | +|                                                                                  | not contain the word Anne, we could use the ``-v`` option.                       | +|     cut -d " " -f 2- marks1.txt | paste -d " " students.txt - | grep -iv Anne     |                                                                                  | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| {{{ Switch to the terminal }}}                                                   | grep allows us to do more complex searches, for instance, searching for          | +| ::                                                                               | lines starting or ending with a particular pattern and regular                   | +|                                                                                  | expression based searches.                                                       | +|     cat students.txt | tr a-z A-Z                                                 |                                                                                  | +|                                                                                  | {{{ Show slide with, tr }}}                                                      | +|                                                                                  |                                                                                  | +|                                                                                  | ``tr`` is a command that takes two sets of characters as parameters, and         | +|                                                                                  | replaces occurrences of the characters in the first set with the                 | +|                                                                                  | corresponding elements from the other set. It reads from the standard            | +|                                                                                  | input and writes to the standard output.                                         | +|                                                                                  |                                                                                  | +|                                                                                  | For instance, if we wish to replace all the lower case letters in the            | +|                                                                                  | students file with upper case, we can do it as,                                  | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| ::                                                                               | A common task is to remove empty lines from a file. The ``-s`` flag              | +|                                                                                  | causes ``tr`` to compress sequences of identical adjacent characters in its      | +|     tr -s '\n' '\n'                                                              | output to a single token.
For example,                                           | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| ::                                                                               | Hit enter 2-3 times and see that every time we hit enter we get a newline.       | +|                                                                                  |                                                                                  | +|     <Enter>                                                                      |                                                                                  | +|     <Enter>                                                                      |                                                                                  | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| ::                                                                               | It replaces sequences of one or more newline characters with a single newline.   | +|                                                                                  |                                                                                  | +|     cat foo.txt | tr -d '\r' > bar.txt                                            | The ``-d`` flag causes ``tr`` to delete all tokens of the specified set of       | +|                                                                                  | characters from its input. In this case, only a single character set             | +|                                                                                  | argument is used. The following command removes carriage return characters,      | +|                                                                                  | thereby converting a file in DOS/Windows format to the Unix format.              | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| ::                                                                               | The ``-c`` flag complements the first set of characters.                         | +|                                                                                  |                                                                                  | +|     tr -cd '[:alnum:]'                                                           |                                                                                  | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| ::                                                                               | It therefore removes all non-alphanumeric characters.                            | +|                                                                                  |                                                                                  | +|     cat items.txt                                                                | Let us consider one more scenario. Suppose we have a list of items, say books,   | +|                                                                                  | and we wish to obtain a list that names each of the books only once, without     | +|                                                                                  | any duplicates. To achieve this, we use the ``uniq`` command.
Let us first       | +|                                                                                  | have a look at our file                                                          | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| ::                                                                               | Now, let us try and get rid of the duplicate lines from this file using          | +|                                                                                  | the ``uniq`` command.                                                            | +|     uniq items.txt                                                               |                                                                                  | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| ::                                                                               | Nothing happens! Why? The ``uniq`` command removes duplicate lines only when     | +|                                                                                  | they are next to each other. So, henceforth, we get a sorted file from the       | +|     sort items.txt | uniq                                                         | original file and work with that file.                                           | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| ::                                                                               | ``uniq -u`` command gives the lines which are unique and do not have any         | +|                                                                                  | duplicates in the file. ``uniq -d`` outputs only those lines which               | +|     uniq -u items-sorted.txt                                                     | have duplicates.                                                                 | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| ::                                                                               | The ``-c`` option displays the number of times each line occurs in the file.     | +|                                                                                  |                                                                                  | +|     uniq -dc items-sorted.txt                                                    |                                                                                  | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| {{{ Show summary slide }}}                                                       | This brings us to the end of this tutorial.                                      | +|                                                                                  | In this tutorial, we have learnt to,                                             | +|                                                                                  |                                                                                  | +|                                                                                  | 1. Use the ``sort`` command to sort lines of text files.                         | +|                                                                                  | #. Use the ``grep`` command to search for text patterns.                         | +|                                                                                  | #. Use the ``tr`` command to translate and/or delete characters.                 | +|                                                                                  | #. Use the ``uniq`` command to omit repeated lines in a text.
| ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| {{{ Show self assessment questions slide }}}                                     | Here are some self assessment questions for you to solve                         | +|                                                                                  |                                                                                  | +|                                                                                  | 1. To obtain patterns, one per line, which of the following commands is used?    | +|                                                                                  |                                                                                  | +|                                                                                  |    - grep -f                                                                     | +|                                                                                  |    - grep -i                                                                     | +|                                                                                  |    - grep -v                                                                     | +|                                                                                  |    - grep -e                                                                     | +|                                                                                  |                                                                                  | +|                                                                                  | 2. Translate the word 'linux' to upper-case.                                     | +|                                                                                  |                                                                                  | +|                                                                                  | 3. Sort the output of the ``ls -al`` command.                                    | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| {{{ Solution of self assessment questions on slide }}}                           | And the answers,                                                                 | +|                                                                                  |                                                                                  | +|                                                                                  | 1. In order to obtain patterns one per line, we use the ``grep`` command         | +|                                                                                  |    along with the -f option.                                                     | +|                                                                                  |                                                                                  | +|                                                                                  | 2. We use the tr command to change the word into uppercase                       | +|                                                                                  |    ::                                                                            | +|                                                                                  |                                                                                  | +|                                                                                  |       echo 'linux' | tr a-z A-Z                                                  | +|                                                                                  |                                                                                  | +|                                                                                  |                                                                                  | +|                                                                                  | 3. We use the sort command as,                                                   | +|                                                                                  |    ::                                                                            | +|                                                                                  |                                                                                  | +|                                                                                  |       ls -al | sort -n -k5                                                       | +|                                                                                  |    The -n means "sort numerically", and the -k5 option means to key off of       | +|                                                                                  |    column five.                                                                  | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| {{{ Show the SDES & FOSSEE slide }}}                                             | Software Development techniques for Engineers and Scientists - SDES, is an       | +|                                                                                  | initiative by FOSSEE. For more information, please visit the given link.         | +|                                                                                  |                                                                                  | +|                                                                                  | Free and Open-source Software for Science and Engineering Education - FOSSEE, is | +|                                                                                  | based at IIT Bombay which is funded by MHRD as part of National Mission on       | +|                                                                                  | Education through ICT.
| ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| {{{ Show the ``About the Spoken Tutorial Project'' slide }}}                     | Watch the video available at the following link. It summarises the Spoken        | +|                                                                                  | Tutorial project. If you do not have good bandwidth, you can download and        | +|                                                                                  | watch it.                                                                        | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| {{{ Show the ``Spoken Tutorial Workshops'' slide }}}                             | The Spoken Tutorial Project Team conducts workshops using spoken tutorials and   | +|                                                                                  | gives certificates to those who pass an online test.                             | +|                                                                                  |                                                                                  | +|                                                                                  | For more details, contact contact@spoken-tutorial.org                            | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| {{{ Show the ``Acknowledgements'' slide }}}                                      | Spoken Tutorial Project is a part of the "Talk to a Teacher" project.            | +|                                                                                  | It is supported by the National Mission on Education through ICT, MHRD,          | +|                                                                                  | Government of India. More information on this mission is available at the        | +|                                                                                  | given link.                                                                      | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ +| {{{ Show the Thank you slide }}}                                                 | Hope you have enjoyed this tutorial and found it useful.                         | +|                                                                                  | Thank you!                                                                       | ++----------------------------------------------------------------------------------+----------------------------------------------------------------------------------+ |
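
The commands walked through in the script above can be tried end to end with the following minimal sketch. The contents of marks1.txt, students.txt and items.txt used here are invented stand-ins, since the tutorial's real files are not shown in this diff, and the shell variable names are purely illustrative. The sketch also assumes that items-sorted.txt is simply the sorted copy of items.txt.

```shell
#!/bin/sh
# Sketch only: every file's contents below are hypothetical sample data.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Assumed layout: marks1.txt holds a roll number followed by marks,
# students.txt holds one name per line, in the same order.
printf '%s\n' '5 89 92 85' '6 98 47 67' '7 67 82 76' > marks1.txt
printf '%s\n' 'Foo' 'Anne' 'Bar' > students.txt

# Drop the first field of marks1.txt and paste each name in front of its marks.
merged=$(cut -d " " -f 2- marks1.txt | paste -d " " students.txt -)

# Sort on the second field (marks in the first subject), numerically, highest first.
by_marks=$(printf '%s\n' "$merged" | sort -t " " -k 2 -rn)

# Case-insensitive search for a single student.
anne=$(printf '%s\n' "$merged" | grep -i anne)

# tr reads standard input; here it upper-cases the names.
upper=$(tr a-z A-Z < students.txt)

# uniq only collapses adjacent duplicates, so sort into a new file first.
printf '%s\n' 'pen' 'book' 'pen' 'ink' 'book' > items.txt
sort items.txt > items-sorted.txt
deduped=$(uniq items-sorted.txt)       # each item once
repeated=$(uniq -d items-sorted.txt)   # only items that occur more than once
only_once=$(uniq -u items-sorted.txt)  # only items that occur exactly once

printf '%s\n' "$by_marks" "$anne" "$upper" "$deduped" "$repeated" "$only_once"
```

Everything is written to a fresh temporary directory, so the sketch does not touch any existing files. Replacing the two `printf` lines with real data files is enough to rerun the same pipelines on actual marks.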