A few shell commands for working with files: deleting the last column, deleting the first line, diff, and more
Delete the first line of a file: sed '1d' filename
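Last column of a file: awk '{print $NF}' filename prints only the last column of each line. To delete the last column instead, use awk '{$NF=""; print}' filename, which blanks the last field and leaves a trailing space.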
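The line and column commands above can be tried on a small throwaway file. This is a minimal sketch: the file name sample.txt and its contents are made up for the demo, and the extra sub() call is only one way to trim the separator that blanking the last field leaves behind.

```shell
#!/bin/sh
# Hypothetical sample file: a header line followed by three-column rows.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' 'name size path' 'a 10 /data/a' 'b 20 /data/b' > "$sample"

# Drop the first line (the header).
no_header=$(sed '1d' "$sample")

# Keep only the last column of each line.
last_col=$(awk '{print $NF}' "$sample")

# Remove the last column instead: blank the last field, then trim the
# trailing separator. Rebuilding the record joins fields with single spaces.
without_last=$(awk '{$NF=""; sub(/[ \t]+$/, ""); print}' "$sample")

printf '%s\n' "$no_header" '---' "$last_col" '---' "$without_last"
rm -rf "$tmpdir"
```

Note that assigning to a field makes awk rebuild the line, so runs of spaces or tabs between the remaining columns collapse to single spaces.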
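Remove duplicate lines with awk: awk '{if (!seen[$0]++) {print $0;}}' filename
Two ways to compare files:
1) comm -3 --nocheck-order file1 file2 : prints the lines that appear in only one of the two files (comm expects sorted input; the GNU option --nocheck-order only skips the sort-order check)
2) grep -v -f file1 file2 : prints the lines of file2 that are not in file1 (each line of file1 is used as a regular expression and may match a substring; add -F -x for exact whole-line matching)
And of course there is diff file1 file2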
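The deduplication and comparison commands above can be checked side by side on two tiny files. This is a sketch under stated assumptions: the file names raw1, file1 and file2 and their contents are invented for the demo, and the inputs are already sorted so that comm behaves predictably without --nocheck-order.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
# Hypothetical inputs, already sorted because comm expects sorted files.
printf '%s\n' apple banana banana cherry > "$tmpdir/raw1"
printf '%s\n' banana cherry date > "$tmpdir/file2"

# Remove duplicate lines while keeping first-seen order.
awk '!seen[$0]++' "$tmpdir/raw1" > "$tmpdir/file1"
deduped=$(cat "$tmpdir/file1")

# Lines unique to either file; lines only in file2 are indented with a tab.
either=$(comm -3 "$tmpdir/file1" "$tmpdir/file2")

# Lines only in file1 (comm -13 would give the lines only in file2).
only_in_1=$(comm -23 "$tmpdir/file1" "$tmpdir/file2")

# Lines of file2 that are not in file1; -F -x force fixed-string,
# whole-line matching instead of regular-expression substring matching.
only_in_2=$(grep -v -x -F -f "$tmpdir/file1" "$tmpdir/file2")

# diff exits with status 1 when the files differ, hence the "|| true".
diff_out=$(diff "$tmpdir/file1" "$tmpdir/file2" || true)

printf '%s\n' "$either" '---' "$only_in_1" '---' "$only_in_2" '---' "$diff_out"
rm -rf "$tmpdir"
```

For one-directional questions such as "what is new in file2", comm -13 or grep -v -x -F -f tends to be easier to post-process than the two-column output of comm -3.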
Here is a shell script I wrote yesterday:
#!/bin/
=` +
=` -d +
=` +
=` + /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/same_similiar_log/ haven
today_input=/home/crawler/petabyte/crawllog/news_data/
=/home/crawler/petabyte/crawllog/news_data/
/opt/hadoop/program/bin/hadoop fs - $yesterday_input/ > /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/
/opt/hadoop/program/bin/hadoop fs - $today_input/ >> /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/
/home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/all_get > /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/
/home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/all_get_without_first_line > /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/
- --nocheck-order /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/all_input /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/input_done > /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/
-v -f /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/input_done /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/all_input > /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/
/home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/today_diff > /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/
/home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/all_input /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/=
= < /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/ $inputfile1
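The commands from the top of this post can be chained into a small "what is new since the last run" pipeline. The following is only an illustrative sketch and not a reconstruction of the script above: list_dir is a hypothetical stand-in for a real listing command, its output lines are fake, and every file name lives in a temporary directory created for the demo.

```shell
#!/bin/sh
# Sketch only: all names and listing contents below are made up.
workdir=$(mktemp -d)

# Hypothetical stand-in for a directory listing: one summary line,
# then one entry per line with the path in the last column.
list_dir() {
    printf '%s\n' 'Found 3 items' \
        '-rw-r--r-- 3 user group 10 2012-01-01 10:00 /news/part-0' \
        '-rw-r--r-- 3 user group 20 2012-01-01 10:05 /news/part-1' \
        '-rw-r--r-- 3 user group 30 2012-01-01 10:10 /news/part-2'
}

# Paths already handled by an earlier run.
printf '%s\n' /news/part-0 > "$workdir/input_done"

# Listing -> drop the summary line -> keep only the path column.
list_dir | sed '1d' | awk '{print $NF}' > "$workdir/all_input"

# Paths not processed yet (exact whole-line comparison).
grep -v -x -F -f "$workdir/input_done" "$workdir/all_input" > "$workdir/today_diff"

# Handle each new path, then record it as done.
processed=""
while IFS= read -r path; do
    processed="$processed $path"
    printf '%s\n' "$path" >> "$workdir/input_done"
done < "$workdir/today_diff"

done_count=$(wc -l < "$workdir/input_done" | tr -d ' ')
echo "processed:$processed"
rm -rf "$workdir"
```

Appending each path to input_done as it is handled means a rerun after a partial failure would skip the paths that were already finished.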