Shell Notes

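Extracting the file name:

Use **${var##*/}**. It removes the longest prefix of var that matches */, that is, everything up to and including the last '/', and returns what follows that last '/'.

Assume file=/dir1/dir2/dir3/my.file.txt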

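Operator	Meaning	Examples
#	Shortest match, counted from the left	${file#*/}: removes the first / and everything to its left: dir1/dir2/dir3/my.file.txt; ${file#*.}: removes the first . and everything to its left: file.txt
##	Longest match, counted from the left	${file##*/}: removes the last / and everything to its left: my.file.txt; ${file##*.}: removes the last . and everything to its left: txt
%	Shortest match, counted from the right	${file%/*}: removes the last / and everything to its right: /dir1/dir2/dir3; ${file%.*}: removes the last . and everything to its right: /dir1/dir2/dir3/my.file
%%	Longest match, counted from the right	${file%%/*}: removes the first / and everything to its right: (empty); ${file%%.*}: removes the first . and everything to its right: /dir1/dir2/dir3/my
*	Wildcard: matches any string	See the examples above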

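Example: ${var%%x*} finds the x that is farthest from the right end (the first x in the string) and removes it together with everything to its right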
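The operators above can be combined to split a path into its parts. This is a minimal sketch, assuming bash and the sample path used in the table; the variable names (name, dir, ext, stem, first) are made up for illustration.

```shell
#!/bin/bash
# Sample path, same shape as the one used in the table.
file=/dir1/dir2/dir3/my.file.txt

name=${file##*/}     # longest prefix matching */ removed  -> my.file.txt
dir=${file%/*}       # shortest suffix matching /* removed -> /dir1/dir2/dir3
ext=${name##*.}      # longest prefix matching *. removed  -> txt
stem=${name%.*}      # shortest suffix matching .* removed -> my.file
first=${name%%.*}    # longest suffix matching .* removed  -> my

printf '%s\n' "$name" "$dir" "$ext" "$stem" "$first"
```

No external command such as basename or dirname is started, which is why this form is common inside loops.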

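bash -x xxx.sh runs the script in trace mode for debugging: every command is printed as it is executed, so the whole flow is visible
export LANG=C

Meaning: LANG is an environment variable that selects the language and character encoding. Setting it to C selects the C (POSIX) locale, the default behavior defined by the ISO/ANSI C language specification.
CURDIR: a built-in variable of make that holds the current directory
PWD: a shell variable that holds the current directory; note that the name is all uppercase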

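In the script below, ${1:2} is the substring of $1 that starts at the third character (offset 2) and runs to the end. If the script is called as cmd -j4, nrjobs is 4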

#!/bin/bash
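# Dispatch on the first argument: --h echoes it back, -jN sets the job count.
case "$1" in
--h)
 echo "your choice is $1"
 ;;
-j*)
 # ${1:2} is whatever follows "-j"; it is empty when no count was attached.
 if [ -n "${1:2}" ]; then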
     nrjobs="${1:2}"
     echo "nrjob nums is ${nrjobs}"
 else
     nrjobs="$1"
     echo "nrjobs is ${nrjobs}"
 fi
 ;;
*)
 echo "invalid choice"
 ;;
esac

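2>&1 redirects standard error to wherever standard output currently points. Written after a file redirection, as in cmd > file 2>&1, it sends both standard output and standard error to the file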
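The position of 2>&1 on the command line matters, because redirections are processed from left to right. The sketch below assumes bash; the helper function both and the variable names are hypothetical and exist only to make the two streams visible.

```shell
#!/bin/bash
tmp=$(mktemp)

# Hypothetical helper that writes one line to each stream.
both() { echo "to stdout"; echo "to stderr" >&2; }

# File redirection first, then 2>&1: both streams end up in the file.
both > "$tmp" 2>&1
in_file=$(cat "$tmp")

# Reversed order: stderr is pointed at the current stdout (here the pipe of
# the command substitution) before stdout is moved to the file.
escaped=$(both 2>&1 > "$tmp")
in_file_reversed=$(cat "$tmp")

rm -f "$tmp"
printf '%s\n' "$in_file" "---" "$escaped" "---" "$in_file_reversed"
```

In the reversed form only standard output reaches the file, which is a common source of "missing" error messages in log files.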
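find ./ -type f -name '*.sh' | tee -a ./record_for_sh2.txt: finds every regular file under the current directory whose name ends in .sh, prints the results, and appends them to the file
cd -: returns to the previous working directory. For example, given the directory /home/dir1/dir2/, if we are in dir1 and run cd dir2, then cd - takes us back to dir1. Likewise, if we are in dir1 and run cd /home/hlc, then cd - returns us to dir1. Very convenient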
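kill -0 PID in a shell script: If sig is 0, then no signal is sent, but error checking is still performed. The exit status therefore tells whether the process exists and may be signaled. Note that kill 0 (PID 0 with the default signal) is a different thing: it sends SIGTERM to every process in the current process group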
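A small sketch of using signal 0 as a liveness check, assuming bash. The sleep 30 process is only a stand-in for a real background job, and the variable names before and after are made up.

```shell
#!/bin/bash
sleep 30 &                 # throwaway background process to probe
pid=$!

# Signal 0 delivers nothing; the exit status says whether $pid can be signaled.
if kill -0 "$pid" 2>/dev/null; then before=alive; else before=gone; fi

kill "$pid"                        # default signal (SIGTERM) really ends it
wait "$pid" 2>/dev/null || true    # reap it; wait returns the job's status

if kill -0 "$pid" 2>/dev/null; then after=alive; else after=gone; fi
echo "before=$before after=$after"
```

A failing kill -0 can also mean "exists, but you lack permission to signal it", so the check is most reliable for processes started by the same user.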
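When a script uses getopts, the shell maintains two variables: OPTIND and OPTARG
- OPTIND starts at 1 and holds the index of the next argument to be processed. getopts returns true as long as an option remains, so it is normally used in a while loop;
- OPTARG is where getopts stores the argument of an option that expects one. In silent mode (an optstring that begins with :), an unexpected option sets $optname to ? and stores the offending option character in OPTARG; if an option requires an argument and none is given, $optname is set to : and the option character that is missing its argument is stored in OPTARG;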
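The two variables can be seen at work in a short parser. This is a sketch under assumptions: parse_args is a hypothetical function, -j takes a value and -v is a flag, and the leading : in the optstring selects the silent mode described above.

```shell
#!/bin/bash
# Hypothetical parser: -j N sets a job count, -v turns on verbose output.
parse_args() {
    local opt OPTIND=1 OPTARG       # reset so the function can be called again
    nrjobs=1 verbose=0 err="" rest=""
    while getopts ":j:v" opt; do
        case "$opt" in
            j)  nrjobs=$OPTARG ;;                         # value that followed -j
            v)  verbose=1 ;;
            :)  err="option -$OPTARG needs a value" ;;    # argument missing
            \?) err="unknown option -$OPTARG" ;;          # option not in optstring
        esac
    done
    shift $((OPTIND - 1))           # drop the options that were consumed
    rest="$*"                       # what is left are the positional arguments
}

parse_args -v -j 4 build
echo "nrjobs=$nrjobs verbose=$verbose rest=$rest err=$err"
parse_args -x
echo "err=$err"
```

Without the leading :, getopts prints its own error message, sets $opt to ? in both error cases, and leaves OPTARG unset.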
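sed 's/=.*$//g': on every input line, deletes everything from the first = to the end of the line (the match is replaced with nothing)
How awk works:

sed_test.sh contains: root:x:0:0:root:/root:/bin/bash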

awk -F: '{print $1,$3}' sed_test.sh
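1. awk reads one line of input and assigns it to the internal variable $0. Each line is also called a record and ends with a newline (the record separator, RS)
2. Each line is split into fields by the separator **==:==** (spaces or tabs by default); each field is stored in a numbered variable, starting at $1
  
  Q: How does awk know to split fields on spaces?
  
  A: The internal variable ==FS== holds the field separator. Its initial value is a single space, which awk treats as "any run of spaces or tabs"; -F: sets it to : instead
3. awk prints the fields with print. The printed fields are ==separated by a space== because there is a comma between $1 and $3. The comma is special: it maps to another internal variable, the ==output field separator== OFS, which defaults to a space
4. After awk has processed a line, it reads the next line from the file into $0, overwriting the previous content, then splits the new string into fields and processes it. This continues until every line has been processed
$0: the whole line

$1…$n: the fields produced by splitting on FS; here $1 is root, $2 is x, and so on

About the comma: the comma between $1 and $3 stands for the output field separator OFS, which defaults to a space. So awk -F: '{print $2 $3}' sed_test.sh prints x0, while awk -F: '{print $2,$3}' sed_test.sh prints x 0; the space between x and 0 comes from OFS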
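The effect of FS and OFS can be checked directly on the sample line. A minimal sketch, assuming any POSIX awk; the variable names joined, spaced, and dashed are invented for the example.

```shell
#!/bin/bash
line='root:x:0:0:root:/root:/bin/bash'

# No comma: the two fields are concatenated.
joined=$(echo "$line" | awk -F: '{print $2 $3}')

# Comma: the fields are joined with OFS, a single space by default.
spaced=$(echo "$line" | awk -F: '{print $2, $3}')

# OFS can be changed with -v; NF is the number of fields on the line.
dashed=$(echo "$line" | awk -F: -v OFS='-' '{print $1, $3, NF}')

printf '%s\n' "$joined" "$spaced" "$dashed"
```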
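Common uses of head and tail
head and tail display the beginning or the end of a file. head first

Syntax: head [OPTION]… [FILE]…

Usage:
- Show the first 10 lines of a file: head -n 10 1.txt
- With a negative count (GNU head): head -n -10 1.txt prints everything except the last 10 lines
- head -v 1.txt: prints a header with the file name before the content
Syntax of tail: tail [OPTION]… [FILE]…
- tail -n 3 1.txt: prints the last three lines of 1.txt
- tail -n +3 1.txt: prints 1.txt from the third line to the end
- tail -v 1.txt: prints a header with the file name
  
  Important: tail -f:
  
  Description: reads from a file that keeps growing. New content is appended to the end of the file, so tail can display it as it is written. Plain tail prints the last 10 lines and exits, which cannot be used for live monitoring; with -f it keeps running and prints new content as the file is updated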
  -f, --follow[={name|descriptor}]
  output appended data as the file grows;
  an absent option argument means 'descriptor'
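The condition of an if-else statement: remember that the test goes inside [ ], with a space after [ and before ]

Useful commands in shell scripts

Command list:

wc:-

Counts the characters and lines in a file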

wc -l :print the newline counts

wc -m: print the character counts

Example: wc -l -m cal_sum.sh | awk '{print "the line:"$1", characters:"$2" \n"}': reports how many lines and characters cal_sum.sh contains

grep -nR "keyword" . | wc -l: grep first outputs the lines that contain the keyword, then wc -l counts them

grep:-

grep Kwan /etc/passwd prints all lines in /etc/passwd that include Kwan. The return value indicates whether any such lines have been found.

sed:-

A stream editor. Doing ls | sed s/a/o/g will produce a listing where all the 'a' characters become 'o' (the 'g' means every 'a' on a line is replaced, not only the first). Numerous tricks and tutorials are available from the Handy one-liners for SED file.

basename:-

Strips the directory and (optionally) suffix from a filename.

tr:-

Translates one set of characters to another set. For example, try

echo "date" | tr d l

The following does case conversion

echo LaTeX | tr '[:upper:]' '[:lower:]'

You can also use it as follows, "capturing" the output in a variable

old="LaTeX"
new=$(echo "$old" | tr '[:upper:]' '[:lower:]')

test:-

This program can be used to check the existence of a file and its permissions, or to compare numbers and strings. Note that if [ -f data3 ] is equivalent to if test -f data3; both check for a file's existence. In bash both [ and test are shell builtins, so neither form is faster. Note that you need spaces around the square brackets. Some examples,

if [ "$LOGNAME" = tpl ] # Did the user log in as tpl? 
if [ "$num" -eq 6 ]     # Is the value of num 6?
if [ "$(hostname)" = tw000 ] # Does hostname produce 'tw000'?
                           # i.e. is that the machine name?

sort:-

ls -l | sort -nr -k5 lists files sorted, largest first, according to the fifth column of ls -l's output (their size, usually). The older form sort -nr +4 counts fields from zero and is obsolete.

cut:-

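Extracts characters or fields from lines. cut -f1 -d':' < /etc/passwd prints all the user names in the passwd file (the numeric uid is field 3). cut works on columns: -f1 selects which field to extract and is used together with -d, which sets the delimiter. The following code shows a way to extract parts of a filename.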

filename=/tmp/var/program.cc
# Get the file name first, then take the parts before and after the dot
b=$(basename "$filename")
prefix=$(echo "$b" | cut -d. -f 1)
suffix=$(echo "$b" | cut -d. -f 2)

find:-

finds files in and below directories according to their name, age, size, etc.

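When doing arithmetic with expr, leave a space between every operand and operator. For example: echo $(expr $a + $b). The multiplication operator * must be escaped as \*
echo -e enables interpretation of backslash escapes

When the string contains any of the following sequences, it is interpreted instead of being printed literally:
\a alert (bell);
\b backspace;
\c produce no further output (so no trailing newline either);
\f form feed;
\n new line;
\r carriage return: move the cursor to the start of the line without advancing to the next line;
\t horizontal tab;
\v vertical tab;
\\ a literal backslash;
\0nnn the character whose octal code is nnn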

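The bc command can be used for calculations in the shell
Usage:

usage: bc [options] [file …]
-h, --help: print help.
-i, --interactive: force interactive mode.
-l, --mathlib: load the standard math library.
-q, --quiet: do not print the welcome banner.
-s, --standard: accept only standard POSIX bc constructs.
-w, --warn: warn about non-standard constructs.
-v, --version: print the version number.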

#!/bin/bash

# Example input: 5+50*3/20 + (19*2)/7
# Strip the whitespace with tr, then evaluate the expression with bc
read -r line
no_white_space="$(echo -e "${line}" | tr -d '[:space:]')"
result=$(echo "$no_white_space" | bc -l)
result_rounded=$(printf "%.3f" "$result")

echo "${result_rounded}"

expr only does integer arithmetic, so precision is lost in division. When that matters, echo the expression into bc -l and print the result
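The difference between the two tools shows up as soon as a division does not come out even. A minimal sketch, assuming bash with expr and bc installed; the values 7 and 2 and the variable names are arbitrary.

```shell
#!/bin/bash
a=7
b=2

int_div=$(expr "$a" / "$b")      # expr truncates to an integer
product=$(expr "$a" \* "$b")     # * must be escaped so the shell does not glob it

# bc keeps decimals: scale sets the number of digits after the point.
real_div=$(echo "scale=3; $a / $b" | bc)

# bc -l uses 20 decimal places by default; printf trims the result for display.
rounded=$(LC_ALL=C printf '%.2f' "$(echo "$a / $b" | bc -l)")

echo "$int_div $product $real_div $rounded"
```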
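readonly defines a read-only variable
To get everything from the x-th character of a line to the end, give cut an open-ended range: cut -c 13- prints from the 13th character to the last
cut uses TAB as its default delimiter. cut -f 2- selects the second field through the last; cut -f -2 selects the first field through the second
Get the first 20 characters of a file: tr '\n' '\t' | cut -c -20 | head -1 | tr '\t' '\n' (newlines are turned into tabs so that cut sees one long line, then turned back)
Get lines 12 through 22 of a file: head -n 22 takes the first 22 lines, then tail -n +12 prints its input from line 12 onward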
head -n 22 | tail -n +12
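The pipeline can be checked on generated input. This sketch assumes bash and the seq utility; seq 1 30 stands in for a real file, so each line's content equals its line number.

```shell
#!/bin/bash
# Lines 12..22 of a 30-line input.
lines=$(seq 1 30 | head -n 22 | tail -n +12)

first=$(echo "$lines" | head -n 1)
last=$(echo "$lines" | tail -n 1)
count=$(echo "$lines" | wc -l)

# The same range with sed, for comparison.
with_sed=$(seq 1 30 | sed -n '12,22p')

echo "first=$first last=$last count=$count"
```

The range is inclusive at both ends, so 12 through 22 yields 11 lines.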

Getting the last line of a file

1.awk 'END {print}'
 
2.sed -n '$p'
 
Two related one-liners print the last two lines rather than only the last one:

3.sed '$!N;$!D'
 
4.awk '{b=a"\n"$0;a=$0}END{print b}'
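The one-liners above can be compared side by side. A minimal sketch, assuming bash, seq, and an awk that keeps $0 available in END (as POSIX specifies); the five-line input is generated so the expected lines are obvious.

```shell
#!/bin/bash
last_awk=$(seq 1 5 | awk 'END {print}')     # last record is still in $0
last_sed=$(seq 1 5 | sed -n '$p')           # print only on the last line
last_tail=$(seq 1 5 | tail -n 1)            # the most direct way

# N appends the next line, D drops the older one, so two lines remain at EOF.
last_two=$(seq 1 5 | sed '$!N;$!D')

printf '%s\n' "$last_awk" "$last_sed" "$last_tail" "$last_two"
```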

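Matching several keywords with grep

# Matches lines that contain word1, word2, or word3 (any one of them is enough)
grep -E "word1|word2|word3" file.txt

# Example: list the svn changes under the configs, package, or target directories
svn st | grep -E "configs|package|target"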
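The | in an extended regular expression means "or". To require every keyword on the same line, one option is to chain several grep calls. A sketch assuming bash; the temporary file and its four sample lines are invented for the example.

```shell
#!/bin/bash
tmp=$(mktemp)
printf '%s\n' 'configs changed' 'package and target' 'nothing here' 'target only' > "$tmp"

# OR: a line matches if it contains any one of the words.
any=$(grep -cE 'configs|package|target' "$tmp")

# AND: each grep filters the output of the previous one.
all=$(grep 'package' "$tmp" | grep -c 'target')

rm -f "$tmp"
echo "any=$any all=$all"
```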

Delete all lowercase letters with tr
tr -d '[:lower:]'
Squeeze runs of spaces into a single space with tr

tr -s: squeezes each run of a repeated character into a single occurrence
tr -s '[:blank:]'

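sort orders lines. It treats each line as a unit and compares lines character by character from the start, by ASCII value (in the C locale), then prints them in ascending order

-u: remove duplicate lines
-r: descending order (ascending is the default)
-o: write the result to a file, like the > redirection
-n: compare numerically (the default is by character)
-t: field delimiter
-k: sort by the N-th field
-b: ignore leading blanks.
-R: random order; the result differs on every run

In the shell, a tab can be written as $'\t'
Given text whose fields are separated by TAB, sort on a chosen column:

-n: compare numerically

-r: descending order

-t: set the delimiter

-kN: sort by the N-th field
sort -nr -t$'\t' -k2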
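The role of -n is easiest to see next to a plain character sort. A minimal sketch, assuming bash; the three fruit rows are made-up data, and LC_ALL=C is set so the character comparison is by byte value.

```shell
#!/bin/bash
# Three tab-separated rows: name, count.
data=$(printf 'pear\t3\napple\t10\nfig\t7\n')

# Numeric, descending, on the second tab-separated field.
sorted=$(printf '%s\n' "$data" | sort -nr -t$'\t' -k2)
top=$(printf '%s\n' "$sorted" | head -n 1 | cut -f1)

# Without -n the keys are compared as text, so "7" sorts above "10".
top_lexical=$(printf '%s\n' "$data" | LC_ALL=C sort -r -t$'\t' -k2 | head -n 1 | cut -f1)

echo "numeric top: $top, lexical top: $top_lexical"
```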
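Single quotes ' '

From the GNU bash manual:

Enclosing characters in single quotes (') preserves the literal value of each character within the quotes. A single quote may not occur between single quotes, even when preceded by a backslash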

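The uniq command

Usage: uniq [OPTION]… [INPUT [OUTPUT]]

Reports or removes repeated lines in a text file. It only compares adjacent lines, so it is usually combined with sort

Options:

-i: ignore case when comparing
-c: prefix each line with the number of times it occurred.
-d: print only duplicated lines, one per group.
-D: print all duplicated lines, every copy of them.
-u: print only the lines that occur once
-f N: skip the first N fields when comparing.
-s N: skip the first N characters when comparing.
-w N: compare no more than the first N characters of each line.
[INPUT]: the (already sorted) input file. If omitted, uniq reads standard input;
[OUTPUT]: the output file. If omitted, uniq writes to standard output (the terminal)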
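Because uniq only looks at neighbouring lines, it is normally fed from sort. A minimal sketch of the usual counting idiom, assuming bash; the six-letter input and the variable names are made up.

```shell
#!/bin/bash
words=$(printf '%s\n' b a b c a b)

# No two equal lines are adjacent, so uniq alone removes nothing.
unsorted_count=$(printf '%s\n' "$words" | uniq | wc -l)

# sort groups equal lines; uniq -c counts each group; sort -nr ranks them.
counts=$(printf '%s\n' "$words" | sort | uniq -c | sort -nr \
         | awk '{print $2 "=" $1}' | paste -sd' ' -)

dups=$(printf '%s\n' "$words" | sort | uniq -d | paste -sd' ' -)   # seen more than once
once=$(printf '%s\n' "$words" | sort | uniq -u)                    # seen exactly once

echo "$counts | dups: $dups | once: $once"
```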

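The paste command

Merges the corresponding lines of files side by side

Usage: paste [OPTION]… [FILE]…

-d: set the delimiter used to join the lines

-s: join all lines of a file into a single line

- - - : each - reads one line from standard input, so n dashes put every n input lines on one output line. Note that the dashes must be separated by spaces

Example: output every three lines as one line: paste - - -
Slice an array: display the elements at positions 3 through 7 (counting from 0, one element per input line) on a single line: head -n 8 | tail -n +4 | paste -s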
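The three forms of paste described above can be tried on generated input. A sketch assuming bash and seq; an explicit - is passed to paste -s so it reads standard input on every implementation.

```shell
#!/bin/bash
# Three dashes: every three input lines become one tab-separated row.
three=$(seq 1 6 | paste - - -)

# -s joins all lines into one; -d picks the delimiter.
serial=$(seq 1 4 | paste -s -d, -)

# Slice: positions 3..7 (counting from 0) of a ten-element list.
slice=$(seq 0 9 | head -n 8 | tail -n +4 | paste -s -)

printf '%s\n' "$three" "$serial" "$slice"
```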
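**(())** performs arithmetic, for example
echo $((1^2))   # ^ is bitwise XOR: prints 3 (exponentiation is **)
echo $((1+2))   # 3
echo $((1*2))   # 2
echo $((1/2))   # integer division: prints 0
Note that inside (( )) the multiplication and division operators do not need escaping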
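A few more operators inside $(( )), as a minimal sketch assuming bash; the variable names are arbitrary.

```shell
#!/bin/bash
xor=$((1 ^ 2))       # bitwise XOR, not a power
pow=$((2 ** 10))     # exponentiation
quot=$((7 / 2))      # integer division truncates
rem=$((7 % 2))       # remainder

i=5
((i += 3))           # (( )) without $ evaluates in place; variables need no $

echo "$xor $pow $quot $rem $i"
```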
To convert an array to a string:
arr=(1 2 3 4)
arr_str="${arr[*]}"
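"${arr[*]}" joins the elements with the first character of IFS, so the separator can be chosen. A minimal sketch, assuming bash; setting IFS inside a command substitution keeps the change local to that subshell.

```shell
#!/bin/bash
arr=(1 2 3 4)

spaced="${arr[*]}"                  # default IFS starts with a space
csv=$(IFS=,; echo "${arr[*]}")      # IFS changed only inside the subshell
count=${#arr[@]}                    # number of elements

echo "$spaced | $csv | $count"
```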
An example of what double quotes "" do
echo $line | cut -f1-3

echo "$line" | cut -f1-3
When double quotes are used, whitespace (a tab is a form of whitespace) is preserved. Without them the shell splits $line into words, and echo joins the words with single spaces, so the tabs are lost. cut uses the tab as its default delimiter for fields. If the line contains no tab (and cut's delimiter has not been changed with -d), cut finds no fields to split and prints the whole line unchanged.
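The difference is easy to reproduce. A minimal sketch, assuming bash with the default IFS; the four-field sample line is made up.

```shell
#!/bin/bash
line=$'a\tb\tc\td'

# Unquoted: word splitting drops the tabs, so cut sees "a b c d" with no
# delimiter in it and passes the whole line through.
unquoted=$(echo $line | cut -f1-3)

# Quoted: the tabs survive and cut returns the first three fields.
quoted=$(echo "$line" | cut -f1-3)

printf '%s\n' "$unquoted" "$quoted"
```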
Statements in an awk program are separated by semicolons

grep

Usage: grep [OPTION]… PATTERNS [FILE]…

Pattern selection and interpretation:
-E, --extended-regexp PATTERNS are extended regular expressions
-F, --fixed-strings PATTERNS are strings
-G, --basic-regexp PATTERNS are basic regular expressions
-P, --perl-regexp PATTERNS are Perl regular expressions
-e, --regexp=PATTERNS use PATTERNS for matching
-f, --file=FILE take PATTERNS from FILE
-i, --ignore-case ignore case distinctions
-w, --word-regexp match only whole words
-x, --line-regexp match only whole lines
-z, --null-data a data line ends in 0 byte, not newline

Miscellaneous:
-s, --no-messages suppress error messages
-v, --invert-match select non-matching lines
-V, --version display version information and exit
--help display this help text and exit

Output control:
-m, --max-count=NUM stop after NUM selected lines
-b, --byte-offset print the byte offset with output lines
-n, --line-number print line number with output lines
--line-buffered flush output on every line
-H, --with-filename print file name with output lines
-h, --no-filename suppress the file name prefix on output
--label=LABEL use LABEL as the standard input file name prefix
-o, --only-matching show only nonempty parts of lines that match
-q, --quiet, --silent suppress all normal output
--binary-files=TYPE assume that binary files are TYPE;
                     TYPE is 'binary', 'text', or 'without-match'
-a, --text equivalent to --binary-files=text
-I equivalent to --binary-files=without-match
-d, --directories=ACTION how to handle directories;
                     ACTION is 'read', 'recurse', or 'skip'
-D, --devices=ACTION how to handle devices, FIFOs and sockets;
                     ACTION is 'read' or 'skip'
-r, --recursive like --directories=recurse
-R, --dereference-recursive likewise, but follow all symlinks
--include=GLOB search only files that match GLOB (a file pattern)
--exclude=GLOB skip files and directories matching GLOB
--exclude-from=FILE skip files matching any file pattern from FILE
--exclude-dir=GLOB skip directories that match GLOB
-L, --files-without-match print only names of FILEs with no selected lines
-l, --files-with-matches print only names of FILEs with selected lines
-c, --count print only a count of selected lines per FILE
-T, --initial-tab make tabs line up (if needed)
-Z, --null print 0 byte after FILE name

Context control:
-B, --before-context=NUM print NUM lines of leading context
-A, --after-context=NUM print NUM lines of trailing context
-C, --context=NUM print NUM lines of output context
-NUM same as --context=NUM
--color[=WHEN],
--colour[=WHEN] use markers to highlight the matching strings;
                     WHEN is 'always', 'never', or 'auto'
-U, --binary do not strip CR characters at EOL (MSDOS/Windows)

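Commonly used options

# Match whole words only
grep -w "keyword" file1

# Ignore case when matching
grep -i "keyword" file

# Select all non-matching lines,
# i.e. print every line that does not contain keyword
grep -v "keyword"

# Match several keywords:
# case-insensitive, whole-word match of the, those, then, or that in file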
grep -Eiw 'th(e|ose|en|at)' file
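The combined effect of -E, -i, and -w can be checked on a few lines. A sketch assuming bash and GNU or BSD grep; the four sample lines are invented so that two contain a whole-word match and two only contain the letters inside a longer word.

```shell
#!/bin/bash
text=$(printf '%s\n' 'The cat' 'other things' 'Then what' 'thatch roof')

# -i ignores case, -w requires the match to be a whole word,
# -E enables the (e|ose|en|at) alternation.
matches=$(printf '%s\n' "$text" | grep -Eiw 'th(e|ose|en|at)')

# -c counts matching lines; -v inverts the selection.
hit_count=$(printf '%s\n' "$text" | grep -cEiw 'th(e|ose|en|at)')
miss_count=$(printf '%s\n' "$text" | grep -cvEiw 'th(e|ose|en|at)')

printf '%s\n' "$matches" "hits=$hit_count misses=$miss_count"
```

"other" and "thatch" contain "the" and "that", but not as whole words, so -w rejects them.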