Quick Reference: awk (gawk), bash, cygwin, perl base64 decoding, etc

GNU Utilities
enscript - convert source code to syntax colored HTML file: enscript --color -p output.html -Epython --language=html source.py
Unicode in Javascript
The Unicode representation of a character in Javascript is written as "\u" followed by four hexadecimal digits. For example, \u0040 represents the "at sign" (@): the four digits are the character's code point in hex (U+0040), which lies in the Basic Latin block (U+0000 to U+007F).
Javascript unicode in action:
alert('Unicode at sign: \u0040'); invoked using onclick.
ghostscript/GSView samples
$ gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -sOutputFile='output.pdf' -I/Resource/Font input.ps
(-dBATCH makes gs exit when it's done instead of waiting at its prompt)
sshd - http://pigtail.net/LRP/printsrv/cygwin-sshd.html
To get the exit status (aka return code) of the last program run
from the bash shell, just type: echo $?
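A quick sketch of reading $? inside a script rather than at the prompt. The variable names are made up for illustration. The main thing to remember is that every command overwrites $?, so save it right away if it's needed later:

```shell
# 'true' always succeeds, so its exit status is 0
true
status_ok=$?

# A subshell can exit with any status from 0 to 255.
# Using || keeps the sketch working even under 'set -e'.
status_custom=0
( exit 3 ) || status_custom=$?

# grep -q exits 0 on a match and 1 on no match;
# inside the else branch $? still holds the status of the test
if printf 'alpha\nbeta\n' | grep -q gamma; then
    grep_result="found"
else
    grep_result="not found (status $?)"
fi

echo "$status_ok $status_custom $grep_result"
```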
Favorite Bash Shortcuts from "Linux in a Nutshell" 3rd Ed. (pp 455+)
Default Emacs Mode:
Ctrl+U/K  Clear from beginning/cursor to cursor/end
Ctrl+A/E  Go to beginning/end of line
Esc f/b   Go forward/back one word
Esc backspace/d  Delete word from cursor backward/forward

source file
read and execute the commands in file in the current shell; file doesn't have to be executable
base64 encode / decode for email
#!/usr/bin/perl -w
# Decodes the base64 file named on the command line and prints it to STDOUT
use strict;
use MIME::Base64;

# my $outFile = '>test.gif';
open( my $in, '<', $ARGV[0] ) or die "ERROR opening file!";
# open( OUTFILE, $outFile ) or die "ERROR creating file " . $outFile . "!";
binmode STDOUT;    # the decoded data may be binary (e.g. a GIF)
while ( <$in> ) {
    my $decoded = decode_base64( $_ );
    print $decoded;
    # print OUTFILE $decoded;   # Use to write to OUTFILE instead of STDOUT
}
close $in;
Simple gcc/g++ Reference - [2006-01-22]
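$ gcc -c (compiles only, doesn't link. It's "compile, then link", not "link, then compile". Geeze!)
$ gcc hello.cpp -o hello
Fails on OS X Tiger w/gcc 4.0 (plain gcc doesn't link the C++ standard library) with the message:

	/usr/bin/ld: Undefined symbols:
But this works, because g++ links the C++ standard library automatically: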
$ g++ hello.cpp -o hello
...and here's the Microsoft Way...
cl /EHsc /GR hello.cpp

/* Static checking and analysis will find potential problems that -Wall and -Wextra (-Wmost Apple-only) won't.
For Example:  (from: http://developer.apple.com/tools/xcode/staticanalysis.html) */

#include <stdio.h>

int addTen(int n);

int main(void) 
{
	float x = addTen(11.95f);
	printf("%f\n", x);
	return 0;
}

int addTen(int n) 
{
	return n + 10;
}

A modern hello world!:

#include <iostream>

using namespace std;

int main()
{
	cout << "Hello, World!" << endl;
	return 0;
}

$ perl -pe 's/[^0-9A-Z\n]//gi' dup_test | sort | uniq -id
$ perl -pe 's/[^0-9\n]//g' dup_test

** /g  for global is key because w/o it only one substitution per line is made **
** MS Word/HTML table parser may work now with this revelation **
$ perl -pe 's/ \D//g' dup_test | sort | uniq -c

      4 123
      1 444
      1 456
wget example(s)
wget -r -l 2 -k http://somesite.com
(-k converts downloaded files so their links are locally-usable)
(-r -l 2   causes recursion two levels deep)
ADDR/add \n:
Use this on the cmd line to print only field 1 of lines that begin with five 0's:
gawk '/^00000/ {print $1}' input.txt > output.txt
Add a \n (an extra blank line) before empty lines, or before lines containing a literal $:
gawk '{if (/^$/) print "\n" $0; else print $0}' input.txt > output.txt
gawk '/\$/{print "\n"$0} $0 !~ /\$/ {print $0}' input.txt > output.txt

Not Equal/Not Contains (:2):
gawk '$0!~/:2/{print $0}' INPUT_FILE.txt > OUTPUT_FILE.txt
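A minimal sketch of the match / not-match patterns on made-up sample data. It uses plain awk so it runs where gawk isn't installed; gawk should behave the same for these patterns:

```shell
# Made-up sample input: three records, one of which contains ":2"
sample=$(printf '00000123 first\n99999:2 second\n00000456 third\n')

# /regex/ selects lines that match: print field 1 of lines starting with five 0's
starts_with_zeros=$(printf '%s\n' "$sample" | awk '/^00000/ {print $1}')

# $0 !~ /regex/ selects lines that do NOT match: drop lines containing ":2"
without_colon2=$(printf '%s\n' "$sample" | awk '$0 !~ /:2/ {print $0}')

echo "$starts_with_zeros"
echo "$without_colon2"
```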

$ gawk -F- '{print "<tr><td>" $2 "<td><a href=\"file://W:/some_dir/" $1 "/" $2 "/" substr($0, 1, 20) "\" target=\"pf\">" substr($3, 1, 6) "</a><td>" substr($3,12)}' OUTPUT_FILE_MORE_THAN_2_PGS.txt > HTML_w_more_than_2_pgs.html

 First do a:  ls > file_list:
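ls -l | gawk '{print "update some_table set size = " $5 " where pk_file = \047" substr($9, 1, 4)  "\047 and some_customer = \047CUST_ID\047 ;\r"}'
	NOTES: \047 = octal code for ' (safer than a hex escape here, since some awks would read the C of CUST_ID as another hex digit)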

Remove Blank lines from a text file :
	Use Cygwin gawk or maybe even Solaris awk?:
	gawk 'length($0) > 0  {print $0}' file_with_blanks > new_file_wo_blanks

Renaming Files/Padding Zeroes:

Output to batch file to check output and then run from cmd:
ls -1 | gawk 'length($0) == 8 {print "ren " $0 " 0" $0 "\r"}' > rename.bat
Run immediately:
ls -1 | gawk 'length($0) == 8 {system("mv " $0 " 0" $0)}'
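A sketch of the "check first, then run" approach in a scratch directory, so nothing real gets renamed. The file names are made up, plain awk stands in for gawk, and it assumes names without spaces (names with spaces would need quoting in the generated commands):

```shell
# Work in a scratch directory so nothing real gets renamed
workdir=$(mktemp -d)
touch "$workdir/1234.txt" "$workdir/001234.txt" "$workdir/99.txt"

# Dry run: only 8-character names are selected, and the mv commands
# are printed instead of executed so they can be reviewed first
plan=$(cd "$workdir" && ls -1 | awk 'length($0) == 8 {print "mv " $0 " 0" $0}')
echo "$plan"

# Once the plan looks right, run it by piping it to sh
(cd "$workdir" && printf '%s\n' "$plan" | sh)
renamed=$(ls -1 "$workdir")
echo "$renamed"

rm -rf "$workdir"
```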

gawk -F{ '{print $8}' file > file_gawked.txt
(field separator is "{")

gawk 'BEGIN {FIELDWIDTHS = "35"} {print $1}'  file.txt
(fixed-width fields: field 1 is the first 35 characters of each line)
To add apostrophes/single quotes for use in SQL IN ('...') lists:
gawk '{print "\x27" $0 "\x27," "\r"}' in.txt > out.txt
	-OR- use TextPad macro.
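A sketch of the quoting trick on a made-up list of IDs, using plain awk and the octal escape \047 (the same character as \x27). The second command is an assumed variant that builds the whole list on one line without a trailing comma:

```shell
# Made-up list of IDs, one per line
ids=$(printf 'A100\nB200\nC300\n')

# \047 is the octal escape for a single quote; it avoids fighting the
# shell's own quoting inside the single-quoted awk program
quoted=$(printf '%s\n' "$ids" | awk '{print "\047" $0 "\047,"}')
echo "$quoted"

# Variant: build the whole list on one line, comma-separated, no trailing comma
in_list=$(printf '%s\n' "$ids" | awk '{printf "%s\047%s\047", (NR > 1 ? ", " : ""), $0} END {print ""}')
echo "IN ($in_list)"
```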
Rename files in a directory (cut off the first 37 chars in the name; awk's substr is 1-based, so substr($0, 38) keeps everything from char 38 on):
ls -1 | gawk '{print "mv \x22" $0 "\x22 \x22" substr($0, 38) "\x22"}' > mvIT
	NOTES: \x22 = hex code for "

Analyze Netgear Home Router Logs:
gawk -F] '/ {print $1}' 0*.log > all_logs.txt
gawk -F. ' {print gensub("\\[ALLOW:",x,1,$(NF - 1)) }' all_logs.txt | sort | uniq -c | sort > all_logs_final.txt
gawk '{if ($0 !~ /^\[ALLOW/) print $0}' 0*.log | sort | uniq | less
sed command(s)
Delete lines 1 through 24 from file.txt: sed '1,24d' file.txt
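A tiny sketch of what the range delete does, with thirty numbered lines from seq standing in for file.txt. The second command is the complement (print only that range), added here for comparison:

```shell
# Thirty numbered lines stand in for file.txt
numbered=$(seq 1 30)

# '1,24d' deletes lines 1 through 24, so only lines 25-30 remain
remaining=$(printf '%s\n' "$numbered" | sed '1,24d')
echo "$remaining"

# The complement: -n suppresses the default output and
# '1,24p' prints only lines 1 through 24
first_24_count=$(printf '%s\n' "$numbered" | sed -n '1,24p' | wc -l | tr -d ' ')
echo "$first_24_count"
```

Note that sed writes the result to stdout; file.txt itself is not changed unless the output is redirected to a new file.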
Microsoft Windows analogs to Unix Commands
Windows                            Unix                   Description
makecab                            gzip, bzip2, compress  file compression
cls                                clear                  clear terminal window
ntsd -pn "someProgram.exe" -c "q"  kill [-9] PID          end a process
  or ntsd -p PID -c "q"
  or PsKill
  or taskkill (WinXP Pro)
Web Designer Essential Bookmarks & Sample Code
Link to a slashdot - xargs/recursion/"md5deep" post
	clamscan -i -v filename
	  runs the Clam Antivirus Scanner (clamav) from the commandline.
	  It found viruses in Mozilla Mail files which AVG Antivirus missed:
		  Trojan.Dropper.JS.Zerolin-6 and HTML.Phishing.Bank-1
	  update clamav using "freshclam"
iconv - convert file encodings (e.g. ISO-8859-1 (Latin) to UTF-8)
	(or try file -i filename
	 or konwert (debian pkg))
od 	- octal (and other types) dump
split 	- split large files, use cat to put them back together
	$ cat hug* > another-hug.jpg
	use diff to make sure there's no difference
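A sketch of the split / cat round trip in a scratch directory. The file names and sizes are made up, and cmp -s is swapped in for diff here because it is silent and works byte-for-byte on binary data:

```shell
workdir=$(mktemp -d)

# Create a 2500-byte stand-in for a large file
head -c 2500 /dev/zero | tr '\0' 'x' > "$workdir/big.dat"

# Split into 1000-byte pieces named part_aa, part_ab, part_ac
split -b 1000 "$workdir/big.dat" "$workdir/part_"
piece_count=$(ls "$workdir"/part_* | wc -l | tr -d ' ')

# The shell expands part_* in sorted order, so cat reassembles correctly
cat "$workdir"/part_* > "$workdir/rejoined.dat"

# cmp -s exits 0 only if the two files are byte-identical
if cmp -s "$workdir/big.dat" "$workdir/rejoined.dat"; then
    verdict="identical"
else
    verdict="different"
fi
echo "$piece_count pieces, files are $verdict"

rm -rf "$workdir"
```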

swig - the Simplified Wrapper and Interface Generator, is a tool for
   easing the interfacing of C/C++ libraries to scripting languages.

Profiler in cygwin -- in binutils? - gprof

How can I find out which dlls are needed by an executable?
  `objdump -p' provides this information, but is rather verbose.
  `cygcheck' will do this much more concisely, and operates recursively, 
  provided the command is in your path.

From: cygwin.com/bugs.html  -
	"Another common problem is attempting to modify the contents of a C "string".
	On Cygwin (and many UNIXes) strings are stored in read-only memory.  So, it is
	not possible to modify them.  You can change this behavior with the gcc option
	-fwritable-strings but we suggest that it is better to change your program."
	(gcc 4.0 and later no longer accept -fwritable-strings.)

[2003-02] From Cygwin mailing list: Parsing XML to HTML:
		from a script...

		xsltproc --output /tmp/db2html.html stylesheet.xsl "$@"

	Typing 'export' w/ no var name prints out all the vars

			 gdb does not have a GUI - type 'insight' instead

	Upgraded by running latest Cygwin 1.5.7 (?) - a big change
		from the previous version I had installed (1.3.19?) since 64-bit
		file I/O is implemented, also got source of zip (PKZIP compatible)
		(downloaded from aka squid.nas.nasa.gov)
		ELFIO - make Linux ELF binary formatted files
			clisp: ANSI Common Lisp
			hexedit: hex editor
			libusb-win32: USB Programming Library
			libgc: Boehm-Demers-Weiser conservative gc for C/C++
			popt: library for parsing cmdline parameters			
			python: interactive OO language
		nasm 0.98.38-1: The Netwide Assembler (to compile Mozilla?)
		(not installed) 
		pdksh 5.2.14-3

			perl 5.8.2-1 and perl-libwin32: Perl extensions for using Win32 API
			postgresql 7.4.1-3
			speex/libspeex1/speex-devel: OpenSource, patent-free speech codec
			ruby: interpreted, OO language
			rpm, dpkg
			tcm: Toolkit for Conceptual Modeling (TCM)			
			WordNet: online lexical reference system
			xdelta/libxdelta2: computes changes between binary files (runtime)			
			ctetris: console version of Tetris
			aalib/libaa1: ASCII Art Library / runtime
			exif/libexif: display EXIF info on the commandline 
			jasper: JPEG-2000 library
			libjpeg62/libjpeg6b: manipulate JPEG files
			libpng/10/12: manipulate PNG files
			libwmf: reads vector image in Windows Metafile Format (WMF)
			xgraph: Xgraph			
			bc/gmp - arbitrary precision calculator/library - can show many, many digits
				after the decimal place of e, pi, etc:
				start bc with the -l option to preload the math library.
				scale = ??  (where ?? is the limit in C of a 32-bit integer?)
				e (1)       (for natural e)				
			singular - is a Computer Algebra System for polynomial 
				computations with special emphasis on the needs of 
				commutative algebra, algebraic
				geometry, and singularity theory.
			inetutils 1.3.2-25: Common networking utilities and servers
			xinetd: The extended Internet services daemon
			cygstart iexplore www.google.com: launch windows programs
		rebase 2.2-3
		sharutils 4.2.1-3: The GNU shar utilities including uuencode/uudecode
		upx: a free, portable, extendable, high-perf. executable packager
		wtf: Translates acronyms and filename suffixes
		catdoc has been added to the cygwin distribution.
			It features 4 binaries, catdoc, xls2csv, catppt and
			wordview to list the content of MS-Word, Excel and Powerpoint files.
Updated 2007-01-05
©2006-2011 JustWidgets.com. All Rights Reserved.