Shell learning notes: network operation examples

From Chapter 5 of the Linux Shell Scripting Cookbook, "Tangled Web? Not At All!"

Parsing website data

$ lynx -dump -nolist http://www.johntorres.net/BoxOfficefemaleList.html |grep -o "Rank-.*" | sed -e 's/ *Rank-\([0-9]*\) *\(.*\)/\1\t\2/' | sort -nk 1 > actresslist.txt

1   Keira Knightley
2   Natalie Portman
3   Monica Bellucci

The website no longer responds. The output above is excerpted from the book
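
Since the site cannot be reached, the text-processing half of the pipeline can still be tried offline. The following is only a minimal sketch: the sample lines are made up to resemble what lynx -dump might print, the function name parse_ranks is my own, and the \t in the sed replacement assumes GNU sed.

```shell
#!/bin/bash
# Hypothetical sample input; the real page content is not available.
sample='   Rank-2  Natalie Portman
   Rank-1  Keira Knightley
   Rank-3  Monica Bellucci'

# Same grep/sed/sort stages as the command above, reading from stdin.
parse_ranks() {
  grep -o "Rank-.*" |
    sed -e 's/ *Rank-\([0-9]*\) *\(.*\)/\1\t\2/' |
    sort -nk 1
}

# Prints the three names as tab-separated "rank<TAB>name" lines, sorted by rank.
printf '%s\n' "$sample" | parse_ranks
```

With real data, the printf line would be replaced by the lynx command and the result redirected to actresslist.txt.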

Image crawler and download tool

#!/bin/bash 
#Purpose: image download tool
#File name: img_downloader.sh
if [ $# -ne 3 ];then
  echo "Usage: $0 URL -d DIRECTORY"
  exit 1
fi
while [ $# -gt 0 ];do
      case $1 in
      -d) shift; directory=$1; shift ;;
      *) url=$1; shift;;
      esac
done
# The case statement checks the first argument ($1). If it is -d, the next argument is the directory: shift past -d and save the directory name. Otherwise the argument is the URL

mkdir -p "$directory"
baseurl=$(echo "$url" | grep -E -o "https?://[a-z.\-]+")
echo "Downloading $url"
# grep -E -o "<img[^>]*src=[^>]*>" prints only the <img> tags that have a src attribute
# sed 's/<img[^>]*src=\"\([^"]*\).*/\1/g' extracts the URL from the string src="url"
# sed "s,^/,$baseurl/," prefixes baseurl to paths that start with /
curl -s "$url" | grep -E -o "<img[^>]*src=[^>]*>" | sed 's/<img[^>]*src=\"\([^"]*\).*/\1/g' | sed "s,^/,$baseurl/," > /tmp/$$.list
cd "$directory" || exit 1
while read -r filename; do
  echo "Downloading $filename"
  curl -s -O "$filename"
done < /tmp/$$.list

The website no longer responds. The script above is excerpted from the book
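
The URL extraction stage of the script can be checked without any network access. This is a rough sketch under my own assumptions: the HTML snippet, the example.com base URL, and the function name extract_img_urls are all invented for illustration.

```shell
#!/bin/bash
# Read HTML on stdin and print one image URL per line.
# $1 is the base URL used to complete paths that start with /.
extract_img_urls() {
  local baseurl=$1
  grep -E -o "<img[^>]*src=[^>]*>" |
    sed 's/<img[^>]*src="\([^"]*\).*/\1/g' |
    sed "s,^/,$baseurl/,"
}

# Hypothetical page fragment with one relative and one absolute image URL.
html='<p><img alt="logo" src="/images/logo.png"> text <img src="http://example.com/a.jpg" width="10"></p>'

printf '%s\n' "$html" | extract_img_urls "http://example.com"
# http://example.com/images/logo.png
# http://example.com/a.jpg
```

Note that only src values beginning with / are completed; relative paths such as images/a.jpg pass through unchanged, exactly as in the script above.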

Web Album Generator

$ cat generate_album.sh 
#!/bin/bash 
#File name: generate_album.sh 
#Purpose: to create an album with pictures in the current directory

echo "Creating album.." 
mkdir -p thumbs 
# The script redirects this part up to EOF1 (excluding EOF1) to index.html
cat <<EOF1 > index.html 
<html> 
<head> 
<style>
body 
{ 
  width:470px; 
  margin:auto; 
  border: 1px dashed grey; 
  padding:10px; 
}
img 
{
  margin:5px; 
  border: 1px solid black;
} 
</style> 
</head> 
<body> 
<center><h1> #Album title </h1></center> <p> 
EOF1

for img in *.jpg; do 
  # An image thumbnail with a width of 100 pixels is created
  convert "$img" -resize "100x" "thumbs/$img"
  echo "<a href=\"$img\" >" >>index.html
  echo "<img src=\"thumbs/$img\" title=\"$img\" /></a>" >> index.html
done

cat <<EOF2 >> index.html
</p> 
</body> 
</html> 
EOF2

echo Album generated to index.html
$ ./generate_album.sh 
Creating album..
Album generated to index.html
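
One detail of the for img in *.jpg loop: when the directory has no .jpg files, bash leaves the pattern unexpanded and the loop body runs once with the literal string *.jpg. A possible guard, sketched here with a helper name (count_jpgs) of my own, is the nullglob shell option.

```shell
#!/bin/bash
# Count the .jpg files in a directory; an empty directory yields 0
# because nullglob makes an unmatched pattern expand to nothing.
count_jpgs() {
  local dir=$1 n=0 img
  shopt -s nullglob
  for img in "$dir"/*.jpg; do
    n=$((n + 1))
  done
  shopt -u nullglob
  echo "$n"
}

tmp=$(mktemp -d)
count_jpgs "$tmp"                  # 0
touch "$tmp/a.jpg" "$tmp/b.jpg"
count_jpgs "$tmp"                  # 2
rm -r "$tmp"
```

In generate_album.sh the same effect could be had by adding shopt -s nullglob before the loop.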

Twitter command line client

#!/bin/bash 
#File name: twitter.sh 
#Purpose: twitter client Basic Edition

oauth_consumer_key=YOUR_CONSUMER_KEY 
oauth_consumer_secret=YOUR_CONSUMER_SECRET

config_file=~/.$oauth_consumer_key-$oauth_consumer_secret-rc

if [[ "$1" != "read" ]] && [[ "$1" != "tweet" ]];then 
	echo -e "Usage: $0 tweet status_message\n  OR\n  $0 read\n"
	exit 1
fi



# source /usr/local/bin/TwitterOAuth.sh
# Use the source command to load the TwitterOAuth.sh library so that its functions can be used to access Twitter. The TO_init function initializes the library
source bash-oauth-master/TwitterOAuth.sh
TO_init

if [ ! -e "$config_file" ]; then
	# Obtain an OAuth token and token secret
	TO_access_token_helper
	if (( $? == 0 )); then
		echo oauth_token=${TO_ret[0]} > "$config_file"
		echo oauth_token_secret=${TO_ret[1]} >> "$config_file"
	fi
fi

source "$config_file"

if [[ "$1" = "read" ]];then
	# The library function TO_statuses_home_timeline fetches the posted tweets from Twitter. The data it returns is one long string in JSON format, for example:
	# [{"created_at":"Thu Nov 10 14:45:20 +0000 2016","id":7...9,"id_str":"7...9","text":"Dining...
	TO_statuses_home_timeline '' 'YOUR_TWEET_NAME' '10'
	echo "$TO_ret" | sed 's/,"/\n/g' | sed 's/":/~/' | \
		awk -F~ '{if ($1 == "text") {txt=$2;} else if ($1 == "screen_name") printf("From: %s\n Tweet: %s\n\n", $2, txt);}' | \
		tr '"' ' '

elif [[ "$1" = "tweet" ]];then
	shift
	TO_statuses_update '' "$@"
	echo 'Tweeted :)'

fi

Excerpt only
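
The sed/awk stage that turns the JSON string into readable text can be tried without a Twitter account. Below is a small sketch run against a made-up JSON string (the field values and the function name format_tweets are my assumptions, and \n in the sed replacement assumes GNU sed). It is a line-splitting trick, not a real JSON parser: a tweet whose text itself contains ," would be cut short.

```shell
#!/bin/bash
# Split the JSON on ," so each key lands on its own line, turn the first
# ": on each line into ~, then let awk pair each text with the
# screen_name that follows it.
format_tweets() {
  sed 's/,"/\n/g' | sed 's/":/~/' |
    awk -F~ '{
      if ($1 == "text") { txt = $2 }
      else if ($1 == "screen_name") printf("From: %s\n Tweet: %s\n\n", $2, txt)
    }' | tr '"' ' '
}

# Hypothetical response, shaped like the example in the script comment.
sample_json='[{"created_at":"Thu Nov 10 14:45:20 +0000 2016","id":1,"text":"Dining out","user":{"id":2,"screen_name":"alice","lang":"en"}}]'

# Prints one From:/Tweet: pair for the single tweet in the sample.
printf '%s\n' "$sample_json" | format_tweets
```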

Query word meaning through Web server

#!/bin/bash 
#File name: define.sh 
#Purpose: used to obtain lexical meaning from dictionaryapi.com

key=YOUR_API_KEY_HERE

if [ $# -ne 2 ]; then
	echo -e "Usage: $0 WORD NUMBER"
	exit 1
fi

# nl prefixes each output line with its line number
curl --silent "http://www.dictionaryapi.com/api/v1/references/learners/xml/$1?key=$key" | grep -o '<dt>.*</dt>' | sed 's$</*[a-z]*>$$g' | head -n "$2" | nl

$ ./define.sh usb 1
1 :a system for connecting a computer to another device (such as a printer, keyboard, or mouse) by using a special kind of cord a USB cable/port USB is an abbreviation of "Universal Serial Bus."

Excerpt only; I did not verify that the script runs
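
Since I could not run the script against the API, here is a sketch that exercises only the grep/sed/head/nl stages on an invented XML fragment. The fragment, its tag layout, and the function name extract_definitions are assumptions; the real response may be laid out differently.

```shell
#!/bin/bash
# Read XML on stdin, keep the <dt>...</dt> elements, strip all tags,
# and number the first $1 definitions.
extract_definitions() {
  local count=$1
  grep -o '<dt>.*</dt>' | sed 's$</*[a-z]*>$$g' | head -n "$count" | nl
}

# Hypothetical response with one <dt> element per line.
sample_xml='<def>
<dt>:a first meaning <vi>an example</vi></dt>
<dt>:a second meaning</dt>
</def>'

# Prints the first definition only, prefixed with the line number 1.
printf '%s\n' "$sample_xml" | extract_definitions 1
```

The sed expression uses $ as its delimiter, so s$</*[a-z]*>$$g simply deletes every opening or closing tag made of lowercase letters.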

Find invalid links in Web site

#!/bin/bash 
#File name: find_broken.sh 
#Purpose: find invalid links in Web site

if [ $# -ne 1 ]; then
	echo -e "Usage: $0 URL\n"
	exit 1; 
fi

echo Broken links:

mkdir /tmp/$$.lynx 
cd /tmp/$$.lynx

# lynx -traversal URL creates several files in the current working directory, including reject.dat, which contains all the links in the website
lynx -traversal "$1" > /dev/null
count=0;

# sort -u is used to create a list without duplicates
sort -u reject.dat > links.txt

while read -r link; do
	# Check the response header returned by curl -I
	output=$(curl -I "$link" -s | grep -e "HTTP/.*OK" -e "HTTP/.*200")
	if [[ -z $output ]]; then
		output=$(curl -I "$link" -s | grep -e "HTTP/.*301")
		if [[ -z $output ]]; then
			echo "BROKEN: $link"
			let count++
		else
			echo "MOVED: $link"
		fi
	fi
done < links.txt

[ $count -eq 0 ] && echo No broken links found.

$ ./find_broken.sh http://10.18.7.30
Broken links:
No broken links found.
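
The script calls curl twice for every link that is not a 200. A possible variation, sketched below, reads the headers once and classifies the status code with a case statement. The function name classify_headers is my own, and treating 302, 307 and 308 as MOVED goes beyond the original script, which only checks for 301.

```shell
#!/bin/bash
# Read HTTP response headers on stdin and print OK, MOVED or BROKEN
# based on the status code in the first line.
classify_headers() {
  local status
  status=$(head -n 1 | tr -d '\r' | awk '{print $2}')
  case $status in
    2??) echo OK ;;
    301|302|307|308) echo MOVED ;;
    *) echo BROKEN ;;          # includes an empty response
  esac
}

printf 'HTTP/1.1 200 OK\r\n' | classify_headers                  # OK
printf 'HTTP/1.1 301 Moved Permanently\r\n' | classify_headers   # MOVED
printf 'HTTP/1.1 404 Not Found\r\n' | classify_headers           # BROKEN
```

Inside the loop it would be fed by a single request per link, for example result=$(curl -s -I "$link" | classify_headers).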

Track website changes

#!/bin/bash 
#File name: change_track.sh 
#Purpose: tracking page changes

if [ $# -ne 1 ]; then
	echo -e "Usage: $0 URL\n"
	exit 1; 
fi

first_time=0 # 0 means this is not the first run

# [ ! -e "last.html" ] checks whether this is the first run. If last.html does not exist, this is the first run, and the Web page must be downloaded and copied to last.html
if [ ! -e "last.html" ]; then
	first_time=1
	# First run 
fi

curl --silent "$1" -o recent.html

if [ $first_time -ne 1 ]; then 
	changes=$(diff -u last.html recent.html) 
	if [ -n "$changes" ]; then 
		echo -e "Changes:\n" 
		echo "$changes" 
	else 
		echo -e "\nWebsite has no changes" 
	fi 
else 
	echo "[First run] Archiving.."
fi

cp recent.html last.html

Excerpt only
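
The comparison step can be tried on two local files instead of a downloaded page. This is a minimal sketch with a function name (report_changes) and temporary files of my own choosing. It relies on the exit status of diff (0 when the files are identical) rather than on testing for empty output.

```shell
#!/bin/bash
# Compare two snapshots of a page and report the differences, if any.
report_changes() {
  local old=$1 new=$2 changes
  if changes=$(diff -u "$old" "$new"); then
    echo "Website has no changes"
  else
    printf 'Changes:\n%s\n' "$changes"
  fi
}

tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/last.html"
printf 'hello\n' > "$tmp/same.html"
printf 'hello world\n' > "$tmp/recent.html"

report_changes "$tmp/last.html" "$tmp/same.html"     # Website has no changes
report_changes "$tmp/last.html" "$tmp/recent.html"   # prints a unified diff
rm -r "$tmp"
```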

Send Web page and read response

POST and GET are two HTTP request types, used to send or retrieve information. In a GET request, parameters (name-value pairs) are sent as part of the page URL. In a POST request, parameters are sent in the HTTP message body. POST is commonly used to submit forms that carry a lot of content or private information

Here we use the sample guestbook website that comes with the tclhttpd package. You can download tclhttpd from http://sourceforge.net/projects/tclhttpd and run it on the local system to create a local Web server. When the user clicks the Add me to your guestbook button, the page sends a request containing the name and URL, and the information in the request is added to the guestbook page to show who has visited the site

Download the tclhttpd package, switch to the bin directory, and start the tclhttpd daemon
tclsh httpd.tcl
Use curl to send a POST request and read the website's response (in HTML format)

$ curl URL -d "postvar=postdata2&postvar2=postdata2"
# or
$ curl http://127.0.0.1:8015/guestbook/newguest.html -d "name=Clif&url=www.noucorp.com&http=www.noucorp.com"

<HTML> 
<Head> 
<title>Guestbook Registration Confirmed</title> </Head>
<Body BGCOLOR=white TEXT=black>
<a href="www.noucorp.com">www.noucorp.com</a>
<DL> <DT>Name <DD>Clif <DT>URL <DD> </DL> 
www.noucorp.com
</Body>

-d submits user data as a POST request. The string argument of -d has the same form as the query string of a GET request: var=value pairs separated by &
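
Because & separates the pairs, a value that itself contains & or a space has to be percent-encoded first. The following is a rough sketch of building such a string in pure bash. The function names urlencode and build_post_data are mine and only ASCII input is handled. curl also has a --data-urlencode option that encodes a value for you.

```shell
#!/bin/bash
# Percent-encode every character except unreserved ones (ASCII input only).
urlencode() {
  local LC_ALL=C
  local s=$1 out= c hex i
  for ((i = 0; i < ${#s}; i++)); do
    c=${s:i:1}
    case $c in
      [a-zA-Z0-9.~_-]) out+=$c ;;
      *) printf -v hex '%%%02X' "'$c"; out+=$hex ;;
    esac
  done
  printf '%s\n' "$out"
}

# Join name=value arguments into one string separated by &.
build_post_data() {
  local pair out=
  for pair in "$@"; do
    out+="${out:+&}$(urlencode "${pair%%=*}")=$(urlencode "${pair#*=}")"
  done
  printf '%s\n' "$out"
}

build_post_data "name=Clif" "url=www.noucorp.com"
# name=Clif&url=www.noucorp.com
build_post_data "name=Tom & Jerry"
# name=Tom%20%26%20Jerry
```

The result can then be passed to curl as a single quoted argument, for example curl URL -d "$(build_post_data "name=Clif" "url=www.noucorp.com")".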

You can also submit data with wget's --post-data "string" option

$ wget http://127.0.0.1:8015/guestbook/newguest.cgi --post-data "name=Clif&url=www.noucorp.com&http=www.noucorp.com" -O output.html

The name-value format is the same as in cURL. The content of output.html is the same as the response returned by the cURL command

Strings sent as POST data (for example, the argument of -d or --post-data) should always be quoted. Otherwise the shell interprets & as a request to run the command as a background process

If you look at the source code of the website (using the View Source option of the web browser), you will find an HTML form similar to the following

<form action="newguest.cgi" method="post" >
<ul>
<li> Name: <input type="text" name="name" size="40" >
<li> Url: <input type="text" name="url" size="40" >
<input type="submit" >
</ul> </form>

Here newguest.cgi is the target URL. When the user enters the details and clicks the Submit button, the name and URL are sent to newguest.cgi as a POST request, and the response page is returned to the browser

Download video from the Internet

There is a video download tool called youtube-dl. Most distributions do not include this tool, and the version in the package repository may not be the latest, so it is best to download it from the official website (http://yt-dl.org).

Follow the links and information on the page to download and install youtube-dl

youtube-dl https://www.youtube.com/watch?v=AJrsl3fHQ74

Use OTS to summarize text

Open Text Summarizer (OTS) can delete irrelevant content from text and generate a concise summary

Most Linux distributions do not include the ots package. It can be installed with the following command

apt-get install libots-devel

ots is easy to use. It reads the input from a file or stdin and outputs the generated summary to stdout

ots LongFile.txt | less
# or
cat LongFile.txt | ots | less

ots can also be combined with curl to summarize a website. For example, you can use ots to summarize long-winded blogs

curl http://BlogSite.org | sed -r 's/<[^>]+>//g' | ots | less
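
The sed stage in that pipeline removes HTML tags before the text reaches ots. A tiny sketch of just that step, with an invented HTML line and a function name (strip_tags) of my own; -E is used here as the portable spelling of -r.

```shell
#!/bin/bash
# Delete everything that looks like an HTML tag: < ... >
strip_tags() {
  sed -E 's/<[^>]+>//g'
}

printf '%s\n' '<p>Linux is <b>fun</b>.</p>' | strip_tags
# Linux is fun.
```

Tags that span several lines, and the contents of script or style elements, are not handled by this one-line expression.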

Translate text on the command line

You can access Google's online translation service through your browser. Andrei Neculau wrote an awk script that accesses the service from the command line and performs the translation

Most Linux distributions do not include this command-line translator, but you can install it directly from Git

cd ~/bin 
wget git.io/trans 
chmod 755 ./trans

trans translates text into the language set by the locale environment variables

$> trans "J'adore Linux"

J'adore Linux

I love Linux

Translations of J'adore Linux French -> English

J'adore Linux I love Linux

You can place an option before the text to be translated to control the source and target languages. The option has the following format

from:to

To translate English into French, use the following command

$> trans en:fr "I love Linux" 
J'aime Linux

Posted by CavemanUK on Thu, 25 Nov 2021 19:58:18 -0800
