git

Using grep and its alternatives for source code (ack/ag/git-grep/cgrep/sgrep/jq/xgrep) and fuzzy searches (agrep/tre)

grep@man prints lines matching a pattern. In addition, two variant programs egrep and fgrep are available. egrep is the same as grep -E. fgrep is the same as grep -F.

grep [OPTIONS] PATTERN [FILE...]

# matching control
'-E,-F,-G,-P' interpret PATTERN as extended regexp, fixed string, basic regexp (default) or perl regexp
'-i/--ignore-case' case insensitive
'-v/--invert-match' select non-matching lines
'-w/--word-regexp' select only those lines containing matches that form whole words

# output control
'-c/--count' output only match count
'-l/--files-with-matches' output only file names
'-m/--max-count=NUM' stop reading a file after NUM matching lines
'-q/--quiet/--silent' do not write any output, exit immediately with zero status if any match is found
'--color[=always|never|auto]' surround the matched string in color

# output prefix
'-H/--with-filename' output file name for each match
'-n/--line-number' output match line number
'-A/--after-context=NUM','-B/--before-context=NUM','-C/--context=NUM' output 'NUM' lines before/after/around

# file selection
'--exclude=GLOB','--include=GLOB' exclude/include-only files whose base name matches GLOB
'-R/-r/--recursive' read files recursively
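
A small sketch of how the file selection and output options combine. The directory layout and file contents are made up for illustration, and '--include' assumes GNU grep.

```shell
# Build a throwaway source tree (hypothetical files).
tmp=$(mktemp -d)
mkdir -p "$tmp/src/lib"
printf '%s\n' '// TODO: tidy up' 'int x;' > "$tmp/src/main.c"
printf '%s\n' 'TODO list' > "$tmp/src/notes.txt"
printf '%s\n' 'int y;' '/* TODO: test */' > "$tmp/src/lib/util.c"

# -r recurses, --include restricts the search to *.c, -l prints only file names.
files=$(cd "$tmp" && grep -r -l --include='*.c' 'TODO' src | sort)

# -n prefixes each match with its line number.
lines=$(cd "$tmp" && grep -r -n --include='*.c' 'TODO' src | sort)

# -q prints nothing; only the exit status says whether a match exists.
if grep -q 'TODO' "$tmp/src/notes.txt"; then found=yes; else found=no; fi

printf '%s\n' "$files" "$lines" "$found"
rm -r "$tmp"
```

notes.txt contains a match too, but it is left out of the first two results because its name does not match the '--include' glob.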

# regexp howto
'.' matches any single char
'[]' matches any one char in the list, e.g. [[:alnum:]], [[:digit:]]; '[^]' matches any char not in the list
'^','$' match at beginning/end of line
'?' '*' '+' '{n}' '{n,}' '{,m}' '{n,m}' match quantifiers
'|' matches either regexp
'()' group regexps
in basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; use the backslashed versions \?, \+, \{, \|, \(, and \) instead
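
A short sketch of the matching options and of the basic/extended difference described above. The sample file contents are invented for the example.

```shell
# Sample input (hypothetical contents).
tmp=$(mktemp -d)
printf '%s\n' 'int main(void)' 'int maintain = 1;' 'return domain;' 'MAIN entry' > "$tmp/sample.c"

# -w matches "main" only as a whole word; -c prints the number of matching lines.
whole_words=$(grep -c -w 'main' "$tmp/sample.c")

# Adding -i ignores case, so "MAIN" is counted as well.
any_case=$(grep -c -i -w 'main' "$tmp/sample.c")

# With -E, '+' is a quantifier: one or more "a" followed by "b".
ere=$(printf 'aab\n' | grep -c -E 'a+b')

# In a basic regexp, '+' is a literal plus sign, so "aab" does not match.
# grep exits with status 1 when nothing matches, hence the '|| true'.
bre=$(printf 'aab\n' | grep -c 'a+b' || true)

echo "$whole_words $any_case $ere $bre"
rm -r "$tmp"
```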

from grep examples and howto use grep
see why GNU grep is fast

ack/ack@man is a grep-like perl script optimized for code search; it is faster because it skips unnecessary files. It searches the current directory recursively by default, ignores meta directories (.git), binaries and backups (~), prints line numbers, highlights matches in color, and supports perl regexps.

# install
$ sudo apt-get install ack-grep | sudo yum install ack (EPEL)
$(deb) sudo dpkg-divert --local --divert /usr/bin/ack --rename --add /usr/bin/ack-grep

ack [options] PATTERN [FILE...]

# matching control
'-w/--word-regexp' force PATTERN to match only whole words
'-Q/--literal' quote all metacharacters in PATTERN, so that it is treated as a literal

# file selection
'--[no]ignore-dir=DIRNAME' ignore/do not ignore directory
'--type=[no]TYPE' specify the types of files to include or exclude from a search
'--type-set=[NAME]=.[ext],.[another-ext]' adds types
'--help-types' list types

# output control
'-A/--after-context=NUM','-B/--before-context=NUM','-C/--context=NUM' output 'NUM' lines before/after/around
'-c/--count' output only match count
'--group/--nogroup' groups matches by file name

from ack@xmodulo

ag is like ack but faster; it skips files matched by the patterns in ‘.gitignore’ and ‘.agignore’.

# install
$ sudo apt-get install silversearcher-ag | sudo yum install the_silver_searcher (EPEL) | cinst ag (windows/chocolatey)

git-grep same as ack/ag but only for git repos.

git grep [options] <pattern> [<pathspec>...]

# file selection (defaults to working directory)
'--cached' searches blobs registered in the index file
'--no-index' searches files in the current directory that are not managed by Git
'--untracked' also searches in untracked files

# matching control
'-E,-F,-G,-P' interpret PATTERN as extended regexp, fixed string, basic regexp (default) or perl regexp
'-i/--ignore-case' ignores case
'--max-depth DEPTH' descend at most DEPTH levels of directories
'-w/--word-regexp' match the pattern only at word boundary
'-v/--invert-match' select non-matching lines
'-e,--and,--or,--not,()' specify how multiple patterns are combined using Boolean expressions

# output control
'-c/--count' print only the number of matching lines per file
'--color[=always|auto|never]' show colored matches
'-h/-H' suppress/show the file name for each match (shown by default)
'-n/--line-number' prefix line number
'-q/--quiet' do not write any output, exit immediately with zero status if any match is found
'-A/--after-context=NUM','-B/--before-context=NUM','-C/--context=NUM' output 'NUM' lines before/after/around
'-p/--show-function' show preceding line with function name
'-W/--function-context' show the whole function in which the match was found

cgrep/cgrep@ubuntu context-aware grep for source code. Another alternative to ack/ag.

# install
$ wget https://github.com/awgn/cgrep/releases | sudo apt-get install

cgrep [OPTIONS] [ITEM]

# context filters and semantic (generic)
'-c/--code' search in source code
'-m/--comment' search in comments
'-l/--literal' search in string literals
'-S/--semantic' "code" pattern: _, _1, _2... (identifiers), $, $1, $2... (optionals), ANY, KEY, STR, CHR, NUM, HEX, OCT, OR.
e.g. "_1(_1 && $)" search for move constructors, "struct OR class _ { OR : OR <" search for a class declaration

# search for a variable
$ cgrep -r --identifier VARname

# search recursively for headers
$ cgrep -r --header "stdio.h"

# search for call (from any struct or pointer) to 'func' with '5' as 2nd argument
$ cgrep --code --semantic '_1 . OR -> func ( _2 , 5, _3 )' file.c

# with sgrep: show all lines containing "sort" but no "nest" in files with an extension .c, preceded by the name of the file
$ sgrep -o "%f:%r" '"\n" _. "\n" containing "sort" not containing "nest"' *.c

# with sgrep: show the beginning of conditional statements, consisting of "if" followed by a condition in parentheses, in files *.c
# ignore "if"s appearing within comments "/* ... */" or on compiler control lines beginning with '#':
$ sgrep '"if" not in ("/*" quote "*/" or ("\n#" .. "\n")) .. ("(" .. ")")' *.c

from cgrep@github

sgrep grep for structured text files. The data model of sgrep is based on regions, which are non-empty substrings of text. Regions are typically occurrences of constant strings or meaningful text elements, which are recognizable through some delimiting strings.

# install
$ sudo apt-get install sgrep | sudo yum install sgrep (Olea)

# show all blocks delimited by braces
$ sgrep '"{" .. "}"' file.c
# show the outermost blocks that contain "sort" or "nest"
$ sgrep 'outer("{" .. "}" containing ("sort" or "nest"))' file.c

from sgrep@man

jq@github command-line JSON processor in C (no extra dependencies).
You can use it to slice and filter and map and transform structured data, alternative to awk, sed and grep.

# install
$ sudo yum install jq (EPEL) | sudo apt-get install jq

$ cat json.txt
{"name": "Google", 
 "location": {"street": "1600 Amphitheatre Parkway","city": "Mountain View", "state": "California","country": "US"},
 "employees": [{"name": "Michael","division": "Engineering"},{"name": "Laura","division": "HR"},{"name": "Elise","division": "Marketing"}]
}

# parse object
$ cat json.txt | jq '.name' 
"Google"

# parse nested object
$ cat json.txt | jq '.location.city' 
"Mountain View"

# parse array
$ cat json.txt | jq '.employees[0].name'
"Michael"

# extract specific fields from object
$ cat json.txt | jq '.location | {street, city}' 
{
  "street": "1600 Amphitheatre Parkway",
  "city": "Mountain View"
}

from How to parse JSON string via command line on Linux and jq tutorial

xgrep@man search content of an XML file

# install
$ sudo yum install xgrep (EPEL) | sudo apt-get install xgrep

'-x xpath' xpath specification of the elements of interest
'-s string' string format in base-element:element/regex/,element/regex/,... where base-element is the name of the elements within which a match should be attempted, the match succeeding if, for each element/regex/ pair, the content of an element of that name is matched by the corresponding regex. If multiple -s flags are specified, a match by any one of them is returned.

# find all person elements with "Smith" in the content of the name element and "2000" in the content of the hiredate element
$ xgrep -s 'person:name/Smith/,hiredate/2000/' *.xml

agrep@wiki “approximate grep” is a proprietary fuzzy grep. TRE/agrep@man is a lightweight, robust, and efficient POSIX compliant regexp matching library with some exciting features such as approximate (fuzzy) matching.

# install
$ sudo apt-get install tre-agrep | sudo yum install agrep (EPEL)
$(deb) sudo dpkg-divert --local --divert /usr/bin/agrep --rename --add /usr/bin/tre-agrep

agrep [OPTION]... PATTERN [FILE]...

# regexp selection and interpretation
'-i/--ignore-case' ignore case distinctions
'-k/--literal' treat PATTERN as a literal string
'-w/--word-regexp' force PATTERN to match only whole words
'-v/--invert-match' select non-matching records instead of matching records

# approximate matching settings
'-D/--delete-cost=NUM' set cost of missing characters to NUM
'-I/--insert-cost=NUM' set cost of extra characters to NUM
'-S/--substitute-cost=NUM' set cost of incorrect characters to NUM
Note that a deletion (a missing character) and an insertion (an extra character) together constitute a substituted character, but the cost will be that of a deletion and an insertion added together.
'-E/--max-errors=NUM' select records that have at most NUM errors.
'-#' select records that have at most # errors (# is a digit between 0 and 9)

# output control
'--color' show colored matches
'-c/--count' print only a count of matching records
'-s/--show-cost' print match cost
'-H/--with-filename' prefix with file name
'-l/--files-with-matches' only print file name

$ tre-agrep -5 -s -i resume example.txt
2:Résumé
1:Resümee
3:rèsümê
0:Resume
5:linuxaria

from How to do fuzzy search with tre-agrep

How to sign/verify Git tags and commits (using GnuPG)

  • git-tag@man is used to create, list, delete or verify a tag object signed with gnupg.
# install gnupg
$ sudo apt-get install gnupg2 | sudo yum install gnupg2

# from git-tag
'-s/--sign' make a GPG-signed tag, using the default e-mail address’s key
'-u/--local-user=<key-id>' make a GPG-signed tag, using the given key (defaults to 'user.signingkey')
'-v/--verify' verify the gpg signature of the given tag names.

# create key pair, asks for your_email@address.com; note: use rng-tools to increase entropy
$ gpg --gen-key
$ gpg --list-secret-keys | grep ^sec
# either use '-u' or
$ git config --global user.signingkey [gpg-key-id]

# create a signed tag with private key
$ git tag --sign [signed-tag-name] -m "message"

# make public key available by storing as raw object and importing them
$ gpg --list-keys
$ gpg -a --export [gpg-key-id] | git hash-object -w --stdin
[object SHA]
# tag key with a name
$ git tag -a maintainer-pgp-pub [object SHA]
# import keys
$ git show maintainer-pgp-pub | gpg --import
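
A sketch of the store-and-tag mechanism above, using a placeholder text in place of a real exported key so that no GnuPG setup is needed. The repository path and the tagger identity are assumptions for the example.

```shell
# Scratch repository; no commits are needed to store and tag a blob.
repo=$(mktemp -d)
git -C "$repo" init -q

# hash-object -w writes stdin into the object database and prints the object SHA.
sha=$(printf 'placeholder public key\n' | git -C "$repo" hash-object -w --stdin)

# An annotated tag takes the tag name first, then the object it points at.
git -C "$repo" -c user.name=Example -c user.email=example@example.com \
  tag -a maintainer-pgp-pub "$sha" -m 'maintainer public key'

# The tag dereferences to the stored blob, which is what gets piped to 'gpg --import'.
kind=$(git -C "$repo" cat-file -t "$sha")
stored=$(git -C "$repo" cat-file blob maintainer-pgp-pub)

echo "$kind: $stored"
rm -rf "$repo"
```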

# verify a tag signature
$ git tag --verify [signed-tag-name]

from Git Tools – Signing Your Work

  • git-commit@man record changes to the repository.
    As of 1.7.9 it’s possible to sign your commits with your private/secret key.
    As of 1.8.3 and later, “git merge” and “git pull” can be told to inspect and reject a merge when a commit being merged does not carry a trusted GPG signature, using the --verify-signatures option.
# from git-commit
'-S<keyid>/--gpg-sign=<keyid>' GPG-sign commit using the given key (defaults to 'user.signingkey')
# from git-log
'--show-signature' check the validity of a signed commit object by passing the signature to 'gpg --verify' and show the output
# from git-merge
'--verify-signatures' verify that the commits being merged have good and trusted GPG signatures and abort the merge in case they do not
'-S' sign the resulting merge commit itself

# sign commit
$ git config --global user.signingkey 8EE30EAB
$ git commit -m "message" -S

# show and verify signature in commit message
$ git log --show-signature 
gpg: Signature made ...
gpg: Good signature from ...

# verify signatures and reject the merge if any commit is not signed
$ git merge --verify-signatures non-verify
fatal: Commit ab06180 does not have a GPG signature.

from Git Tools – Signing Your Work

How to add signed-off-by lines by amending Git commit messages (using git-interpret-trailers)

The ‘Signed-off-by:’ tag indicates that the signer was involved in the development of the patch, or that he/she was in the patch’s delivery path. It is a simple line at the end of the explanation for the patch, which certifies that you wrote it or otherwise have the right to pass it on as an open-source patch.

‘Acked-by:’ is used by the maintainer of the affected code when that maintainer neither contributed to nor forwarded the patch. If a person has had the opportunity to comment on a patch, but has not provided such comments, you may optionally add a ‘Cc:’ tag to the patch.

If this patch fixes a problem reported by somebody else, consider adding a ‘Reported-by:’ tag to credit the reporter for their contribution. A ‘Tested-by:’ tag indicates that the patch has been successfully tested (in some environment) by the person named. ‘Reviewed-by:’, instead, indicates that the patch has been reviewed and found acceptable.

# amend commit with signed-off-by
$ git commit --amend --signoff
$ git log
...
Signed-off-by: Alice <alice@example.com>
# configure a 'sign' trailer with a 'Signed-off-by' key, and then add two of these trailers to a message
$ git config trailer.sign.key "Signed-off-by"
$ cat msg.txt
subject

message
$ cat msg.txt | git interpret-trailers --trailer 'sign: Alice <alice@example.com>' --trailer 'sign: Bob <bob@example.com>'
subject

message

Signed-off-by: Alice <alice@example.com>
Signed-off-by: Bob <bob@example.com>

# extract the last commit as a patch, and add a Cc and a Reviewed-by trailer to it
$ git format-patch -1
0001-foo.patch
$ git interpret-trailers --trailer 'Cc: Alice <alice@example.com>' --trailer 'Reviewed-by: Bob <bob@example.com>' 0001-foo.patch >0001-bar.patch

# configure a sign trailer with a command to automatically add a 'Signed-off-by: ' with the author information only if there is no 'Signed-off-by: ' already
$ git config trailer.sign.key "Signed-off-by: "
$ git config trailer.sign.ifmissing add
$ git config trailer.sign.ifexists doNothing
$ git config trailer.sign.command 'echo "$(git config user.name) <$(git config user.email)>"'
$ git interpret-trailers <<EOF
EOF

Signed-off-by: Bob <bob@example.com>
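
A minimal sketch of adding the other tags described earlier with git interpret-trailers. It assumes no trailer.* configuration is set, and the names and addresses are placeholders.

```shell
# Append two trailers to a message read from stdin.
msg=$(printf 'subject\n\nmessage\n' |
  git interpret-trailers \
    --trailer 'Reported-by: Alice <alice@example.com>' \
    --trailer 'Tested-by: Bob <bob@example.com>')

# Count the trailer lines and pick out the last line of the result.
trailer_count=$(printf '%s\n' "$msg" | grep -c -E '^(Reported|Tested)-by: ')
last_line=$(printf '%s\n' "$msg" | tail -n 1)

printf '%s\n' "$msg"
```

The trailers are appended in the order given on the command line, separated from the message body by a blank line.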

How to version control /etc in Linux (using etckeeper)

It is a good idea to “version control” everything in /etc directory, so that you can track configuration changes, or recover from a previous configuration state if need be.

etckeeper is a collection of tools for versioning content, specifically in the /etc directory. It uses existing revision control systems (e.g., git, bzr, mercurial, or darcs) to store version history in a corresponding backend repository. It integrates with package managers (e.g., apt, yum) to automatically commit any changes made to /etc during package installation, upgrade or removal. It tracks file metadata that revision control systems do not normally support, but that is important for /etc, such as the permissions of /etc/shadow.

# install w/ git the default
$ (el/centos) sudo yum install etckeeper git-core | (debian/ubuntu) sudo aptitude install etckeeper git-core
# or w/ bzr
$ (el/centos) sudo yum install etckeeper bzr | (debian/ubuntu) sudo aptitude install etckeeper etckeeper-bzr
$ cat /etc/etckeeper/etckeeper.conf
VCS="bzr"

# setup
$ etckeeper init
$ etckeeper commit "initial commit"

# now can use regular git/bzr commands to handle further changes, or etckeeper vcs
$ etckeeper vcs status | sudo git status
$ etckeeper vcs diff /etc | sudo git diff /etc
$ etckeeper commit "any comment" | sudo git commit -m "any comment"
$ etckeeper vcs log /etc/sysconfig/*
# the numbered revision options below use bzr syntax (VCS="bzr")
$ etckeeper vcs diff -r1..3
$ etckeeper vcs diff -c3
$ etckeeper vcs revert --revision 2 /etc

# automatic commit changes made to /etc as part of package installation or upgrade
$ yum install httpd
$ etckeeper vcs log | git log --summary -1
$ etckeeper vcs diff -c5

# manually commit changes made to /etc by other commands
$ passwd someuser
$ git status
$ git commit -a -m "changed a password"

# remove/ignore some files
$ git rm --cached printcap # modified by CUPS
$ echo printcap >> .gitignore
$ git commit -a -m "don't track printcap"

# checkout a different /etc branch
$ git checkout april_first_joke_etc
$ etckeeper init

# use clone to back up /etc to a remote server
$ ssh server 'mkdir /etc-clone; cd /etc-clone; chmod 700 .; git init --bare'
$ git remote add backup ssh://server/etc-clone
$ git push backup --all
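
A sketch of the ignore and backup steps above using plain git, so that it can be tried without root. Two temporary directories stand in for /etc and for the server's /etc-clone, and the file contents and committer identity are assumptions for the example.

```shell
work=$(mktemp -d)      # stands in for /etc
backup=$(mktemp -d)    # stands in for server:/etc-clone
git -C "$backup" init -q --bare
git -C "$work" init -q

# One file to track and one to ignore (hypothetical contents).
echo 'root:x:0:' > "$work/group"
echo 'lp|printer' > "$work/printcap"
echo printcap >> "$work/.gitignore"
git -C "$work" add group .gitignore
git -C "$work" -c user.name=Example -c user.email=example@example.com \
  commit -q -m 'initial commit'

# Push every branch to the bare backup repository.
git -C "$work" remote add backup "$backup"
git -C "$work" push -q backup --all

# The backup holds the tracked files; printcap only shows up as ignored.
head=$(git -C "$work" rev-parse HEAD)
backed_up=$(git -C "$backup" ls-tree -r --name-only "$head")
ignored=$(git -C "$work" status --porcelain --ignored)

printf '%s\n' "$backed_up" "$ignored"
rm -rf "$work" "$backup"
```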

# multiple machines: start with an etckeeper repository on one machine, then add another machine's etckeeper repository as a git remote and diff/merge them; do not check it out
$ git remote add dodo ssh://dodo/etc
$ git fetch dodo
$ git diff dodo/master group |head

from How to version control /etc directory in Linux