programming

“People got OO all wrong, it’s not about classes or objects. The big idea is messaging” (from Alan Kay)

Just a gentle reminder that I took some pains at the last OOPSLA to try to remind everyone that Smalltalk is not only NOT its syntax or the class library, it is not even about classes. I’m sorry that I long ago coined the term “objects” for this topic because it gets many people to focus on the lesser idea.

The big idea is “messaging” -- that is what the kernel of Smalltalk/Squeak is all about (and it’s something that was never quite completed in our Xerox PARC phase).

from Kay, Alan. “prototypes vs classes was: Re: Sun’s HotSpot”

Kay is one of the fathers of object-oriented programming and the inventor of Smalltalk.


How to write a Linux kernel module

Kernel modules are pieces of code that can be loaded into and unloaded from the kernel on demand. They extend the functionality of the kernel without the need to reboot the system.

## obtaining information
# list loaded modules
$ lsmod
# show module info
$ modinfo MODULENAME
# list dependencies
$ modprobe --show-depends MODULENAME

## automatic module load
# tell systemd-modules-load which modules to load at boot, see 'man modules-load.d'
$ vi {/etc,/run,/usr/lib}/modules-load.d/PROGRAM.conf
MODULENAME

## manual module load
# load by name
$ modprobe MODULENAME
# load by filename from '/lib/modules/$(uname -r)/'
$ insmod FILENAME [ARGS]
# unload module
$ modprobe -r MODULENAME
# same
$ rmmod MODULENAME

## passing parameters to module
# either from '/etc/modprobe.d'
$ vi /etc/modprobe.d/FILENAME.conf
options MODULENAME parametername=parametervalue
# or from kernel command line
MODULENAME.parametername=parametervalue

## blacklisting: prevent the kernel module from loading
# either from '/etc/modprobe.d'
$ vi /etc/modprobe.d/FILENAME.conf
blacklist MODULENAME
# or from kernel command line
modprobe.blacklist=modname1,modname2,modname3

from kernel modules@arch
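lsmod itself does little more than format the contents of /proc/modules. As a rough illustration, the sketch below parses the first three fields of one line of that file in C. The struct `module_entry` and the function `parse_module_line` are names made up for this example, and the sample line in the usage note is invented rather than taken from a real system.

```c
#include <stdio.h>

/* Hypothetical record for the first three fields of a /proc/modules line. */
struct module_entry {
    char name[64];       /* module name */
    unsigned long size;  /* memory size in bytes */
    int refcount;        /* number of current users */
};

/*
 * Parse a line in /proc/modules format:
 *   name size refcount dependencies state address
 * Only name, size and refcount are extracted.
 * Returns 0 on success, -1 if the line does not match.
 */
static int parse_module_line(const char *line, struct module_entry *out) {
    int fields = sscanf(line, "%63s %lu %d",
                        out->name, &out->size, &out->refcount);
    return fields == 3 ? 0 : -1;
}
```

Called on a line such as `hello 16384 0 - Live 0x0000000000000000`, it would fill in the name `hello`, a size of 16384 and a use count of 0.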

You can write your own modules, see the linux kernel module programming guide.

# install build dependencies (compiler and kernel headers)
$(deb) apt-get install build-essential linux-headers-$(uname -r)
$(el) yum install gcc gcc-c++ make kernel-devel
$(arch) pacman -Syu base-devel linux-headers

# write a hello world module
$ vi hello.c
#include <linux/module.h> // all kernel modules
#include <linux/kernel.h> // KERN_EMERG, KERN_ALERT, KERN_CRIT, ... 
#include <linux/init.h>   // __init and __exit macros
MODULE_LICENSE("GPL");
MODULE_AUTHOR("You");
MODULE_DESCRIPTION("A Simple Hello World module");
static int __init hello_init(void) {
    printk(KERN_NOTICE "Hello world!\n");
    return 0; // non-0 means init_module failed
}
static void __exit hello_cleanup(void) {
    printk(KERN_NOTICE "Cleaning up module.\n");
}
module_init(hello_init);
module_exit(hello_cleanup);

# note: the recipe lines below must start with a tab character
$ vi Makefile
obj-m += hello.o
all:
    make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
    make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

# testing
$ make && sudo insmod hello.ko
$ dmesg|grep -i hello
$ sudo rmmod hello.ko

from how to write your own linux kernel module

How to set up a GNU/Linux-like environment in Windows (using cygwin, mingw, gow or msys2)

cygwin/cygwin@wiki is a Unix-like environment and command-line interface for Windows. Cygwin consists of two parts: a dynamic-link library (DLL) as an API compatibility layer providing a substantial part of the POSIX API functionality, and an extensive collection of software tools and applications that provide a Unix-like look and feel.

%comspec% cinst cygwin -y (or https://cygwin.com/install.html)

# open bash terminal
%comspec% %CYGWINPATH%\bin\bash.exe --login -i
# open a terminal emulator
%comspec% %CYGWINPATH%\bin\mintty.exe

from MinTTY Gives Cygwin a Native Windows Interface

mingw@fedora/mingw-w64 brings free software toolchains to Windows. It hosts a vibrant community which builds and debugs software for Windows while providing a development environment for everyone to use.

$ vi hello.c
#include <stdio.h>
int main () { printf ("Hello world!\n"); return 0; }

## build using 'gcc', dependent on 'cygwin1.dll' 3.2 MB
# open cygwinsetup.exe and install 'gcc' 
$ gcc hello.c -o hello-gcc.exe

## build using 'mingw64', dependent on 'msvcrt.dll' / native
# open cygwinsetup.exe and install 'mingw64-x86_64' or 'mingw64-i686'
# note: http://www.delorie.com/howto/cygwin/mno-cygwin-howto.html
$ x86_64-w64-mingw32-gcc hello.c -o hello-mingw64.exe
# or ./configure --host=x86_64-w64-mingw32 ...

gow@github (Gnu On Windows) is the lightweight alternative to cygwin. It uses a convenient Windows installer that installs about 130 extremely useful open source UNIX applications compiled as native win32 binaries.

%comspec% cinst gow -y (or https://github.com/bmatzelle/gow/releases)
# note: it adds 'Gow\bin' to PATH

# list available commands
%comspec% gow.bat -l

# execute bash shell script
%comspec% bash.exe script.sh [script options]

from gow@tuxdiary

msys2 (Minimal SYStem 2) is a fork of cygwin that focuses on Windows interoperability rather than full POSIX compatibility and uses MinGW-w64 toolchains. It also ports Arch’s pacman for easy package management.

# see http://sourceforge.net/p/msys2/wiki/MSYS2%20installation/

# open a shell
%comspec% %MSYS64PATH%/msys2_shell.bat

# install new package
$ pacman -Suy PACKAGE
# search package
$ pacman -Ss PATTERN
# list packages installed
$ pacman -Q

# build using 'mingw64' or 'gcc', both depend on 'msys-2.0.dll' 3.2 MB
$ x86_64-pc-msys-gcc hello.c -o hello-mingw64.exe
$ pacman -Syu gcc
$ gcc hello.c -o hello-msys2.exe

from msys2@tuxdiary

How to build and package Erlang OTP applications (using rebar)

Rebar is an Erlang build tool that makes it easy to compile and test Erlang applications, port drivers and releases. It’s a self-contained binary.

Using rebar

Either install it from the distribution repo (if available), use a prebuilt binary, or compile it from source.

# requirements, see http://www.erlang.org/doc/installation_guide/INSTALL.html
$ sudo apt-get install erlang

# either from repo (not recommended, too old)
$ sudo apt-get install rebar
# or compile from source
...
# or prebuilt binary (recommended)
$ wget https://github.com/rebar/rebar/wiki/rebar ; chmod +x rebar

from rebar@github

Use a template to create a new application that follows the OTP Design Principles.

$ mkdir rebar-helloworld ; cd rebar-helloworld
$ ../rebar create-app appid=app1
==> app1 (create-app)
Writing src/app1.app.src # Application descriptor
Writing src/app1_app.erl # Application callback module
Writing src/app1_sup.erl # Supervisor callback module

Next add a generic server to the application. A gen_server is the server side of a client-server relation. You fill in a predefined set of functions in a callback module.

# either create file
$ cat src/app1_srv.erl
-module(app1_srv).
-behaviour(gen_server).
-export([start_link/0, say_hello/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).
start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).
init([]) ->
    {ok, []}.
say_hello() ->
    gen_server:call(?MODULE, hello).
%% callbacks
handle_call(hello, _From, State) ->
    io:format("Hello from server!~n", []),
    {reply, ok, State};
handle_call(_Request, _From, State) ->
    Reply = ok,
    {reply, Reply, State}.
handle_cast(_Msg, State) ->
    {noreply, State}.
handle_info(_Info, State) ->
    {noreply, State}.
terminate(_Reason, _State) ->
    ok.
code_change(_OldVsn, State, _Extra) ->
    {ok, State}.

# or use template
$ rebar create template=simplesrv srvid=app1_srv
==> app1 (create)
Writing src/app1_srv.erl

# and add say_hello/0 function
-export([start_link/0, say_hello/0, stop/0]).
%% API Function Definitions
say_hello() ->
    gen_server:call(?MODULE, hello).
stop() ->
    gen_server:cast(?MODULE, stop).
%% callbacks (gen_server Function Definitions)
handle_call(hello, _From, State) ->
    io:format("Hello from srv!~n", []),
    {reply, ok, State};
handle_call(_Request, _From, State) ->
    Reply = ok,
    {reply, Reply, State}.
handle_cast(stop, State) ->
    {stop, normal, State};
handle_cast(_Msg, State) ->
    {noreply, State}.

We could compile now, but let’s create another application, lib1, and compile both.

# move both apps to '/apps'
$ mkdir -p apps/{app1,lib1}
$ mv src apps/app1/
$ cd apps/lib1
$ rebar create-app appid=lib1
$ cat src/hello.erl
-module(hello).
-export([hello/0]).
hello() ->
    io:format("Hello for lib!", []).

# compile both
$ cd ../..
$ cat rebar.config
{sub_dirs, ["apps/app1", "apps/lib1"] }.
$ tree
.
├── apps
│   ├── app1
│   │   └── src
│   │       ├── app1_app.erl
│   │       ├── app1.app.src
│   │       ├── app1_srv.erl
│   │       └── app1_sup.erl
│   └── lib1
│       └── src
│           ├── hello.erl
│           ├── lib1_app.erl
│           ├── lib1.app.src
│           └── lib1_sup.erl
└── rebar.config
$ rebar compile
==> app1 (compile)
==> src (compile)
==> lib1 (compile)
Compiled src/hello.erl
==> src (compile)
==> app1 (compile)

# test (in development)
$ erl -pa apps/*/ebin
1> app1_srv:start_link().
{ok,}
2> app1_srv:say_hello().
Hello from server!
3> app1_srv:stop().
ok
4> hello:hello().
Hello for lib!ok

How can app1 call a library function from lib1?

$ cat apps/app1/src/app1_srv.erl
...
%% callbacks
handle_call(hello, _From, State) ->
    hello:hello(),
    io:format("~nHello from srv!~n", []),
    {reply, ok, State}.
$ cat apps/app1/src/app1_sup.erl
...
init([]) ->
    {ok, {{one_for_one, 1, 60}, [?CHILD(app1_srv, worker)]}}.

# recompile and test
$ rebar clean compile
$ erl -pa apps/*/ebin
1> app1_sup:start_link().
{ok,}
2> app1_srv:say_hello().
Hello for lib!Hello from server!
ok

To make the application work like a service (start, stop, console, …), create a release: a complete system consisting of these applications and a subset of the Erlang/OTP applications.

$ mkdir rel; cd rel
$ rebar create-node nodeid=app1
==> rel (create-node)
Writing reltool.config
Writing files/erl
Writing files/nodetool
Writing files/app1
Writing files/sys.config
Writing files/vm.args
Writing files/app1.cmd
Writing files/start_erl.cmd
Writing files/install_upgrade.escript
$ cd -

$ cat rel/reltool.config
{lib_dirs, ["../apps"]},

$ cat rebar.config
{sub_dirs, ["apps/app1", "apps/lib1", "rel"] }.

# compile and generate release
$ rebar clean compile generate
# if "ERROR: Unable to generate spec: read file info /usr/lib/erlang/man/man1/XPTO.gz" failed then "sudo rm /usr/lib/erlang/man/man1/XPTO.gz"
# ignore "WARN:  'generate' command does not apply to directory", see https://github.com/rebar/rebar/issues/253
$ ls rel/app1
bin  erts-6.1  etc  lib  log  releases

# test using console
$ ./rel/app1/bin/app1 console
# if "...'cannot load',elf_format,get_files}}" then add to 'reltool.config'
{app, hipe, [{incl_cond, exclude}]},
(mysample@127.0.0.1)1> application:which_applications().
[{app1,[],"1"},
 {sasl,"SASL  CXC 138 11","2.4"},
 {stdlib,"ERTS  CXC 138 10","2.1"},
 {kernel,"ERTS  CXC 138 10","3.0.1"}]
(mysample@127.0.0.1)2> app1_srv:say_hello().
Hello for lib!Hello from server!
ok

# start and attach to get console
$ ./rel/app1/bin/app1 start ; ./rel/app1/bin/app1 attach

# deploy/export
$ tar -C rel -czvf app1.tar.gz app1

from release-handling@github

Rebar makes building Erlang releases easy. One of the advantages of using OTP releases is the ability to perform hot-code upgrades. To do this you need to build an upgrade package that contains the built modules and instructions telling OTP how to upgrade your application.

# generate first release
$ rebar clean compile generate
$ mv rel/app1 rel/app1_1.0

# change 'hello' function
$ cat apps/app1/src/app1_srv.erl
handle_call(hello, _From, State) ->
    hello:hello(),
    {_,{Hour,Min,Sec}} = erlang:localtime(),
    io:format("Hello from server at ~2w:~2..0w:~2..0w!~n", [Hour,Min,Sec]),
    {reply, ok, State};

# bump version and generate new release
$ cat rel/reltool.config
       {rel, "app1", "2",
$ cat apps/app1/src/app1.app.src
  {vsn, "2"},
$ rebar clean compile generate
$ tree rel -d -L 2
rel
├── app1
│   ├── bin
│   ├── erts-6.1
│   ├── lib
│   ├── log
│   └── releases
│       └── 2
├── app1_1.0
│   ├── bin
│   ├── erts-6.1
│   ├── lib
│   ├── log
│   └── releases
│       └── 1
└── files

To make an upgrade, you must have a valid .appup file. This tells the Erlang release_handler how to upgrade and downgrade between specific versions of your application.

# generate '.appup' upgrade instructions
$ cd rel ; rebar generate-appups previous_release=app1_1.0
==> rel (generate-appups)
Generated appup for app1
Appup generation complete
# see './app1/lib/app1-2/ebin/app1.appup'
% appup generated for app1 by rebar ("2015/01/23 16:03:24")
{"2", [{"1", [{update,app1_srv,{advanced,[]},[]}]}], [{"1", []}]}.

# now create the upgrade package
$ rebar generate-upgrade previous_release=app1_1.0
==> rel (generate-upgrade)
app1_2 upgrade package created
$ cd .. ; ls rel
app1  app1_1.0  app1_2.tar.gz  files  reltool.config

# install upgrade using 'release_handler'
$ mv rel/app1_2.tar.gz rel/app1_1.0/releases
$ ./rel/app1_1.0/bin/app1 console
1> release_handler:which_releases().
[{"app1","1",[],permanent}]
2> release_handler:unpack_release("app1_2").
{ok,"2"}
3> release_handler:install_release("2").
{ok,"1",[]}
4> release_handler:make_permanent("2").
ok
5> app1_srv:say_hello().
Hello for lib!Hello from server at 16:15:24!
ok
6> release_handler:which_releases().
[{"app1","2",[],permanent},{"app1","1",[],old}]

# generating a v3
$ mv rel/app1 rel/app1_2.0
# make code change ... and compile/generate upgrade package
$ rebar clean compile generate
$ cd rel ; rebar generate-appups previous_release=app1_2.0
$ rebar generate-upgrade previous_release=app1_2.0
$ ls
app1  app1_1.0  app1_2.0  app1_3.tar.gz  files  reltool.config

from upgrades@github and rebar tutorial.

Rebar can fetch and build projects including source code from external sources (git, hg, etc.). See erlang-libs.

$ cat rebar.config
{deps, [
    {'erlcloud', ".*", { git, "https://github.com/gleber/erlcloud.git"}},
    {'lager', ".*", { git, "git://github.com/basho/lager.git"} }
]}.
$ rebar update-deps
$ cat src/app1.app.src
    {applications, [
        ..., erlcloud, lager
    ]},

from dependency management (https://github.com/rebar/rebar/wiki/Dependency-management)

All code is available in erlang-otp-helloworld@github.

Using rebar3

Rebar3 is an experimental branch that tries to solve some issues, see announcement.

$ wget https://s3.amazonaws.com/rebar3/rebar3 ; chmod +x rebar3

It comes with templates for creating applications, library applications (with no start/2), releases and plugins. Use the new command to create a project from a template. It accepts lib, app, release and plugin as the first argument and the name for each as the second argument.

# create new application and release
$ rebar3 new release app1
===> Writing app1/apps/app1/src/app1_app.erl
===> Writing app1/apps/app1/src/app1_sup.erl
===> Writing app1/apps/app1/src/app1.app.src
===> Writing app1/rebar.config
===> Writing app1/config/sys.config
===> Writing app1/config/vm.args
===> Writing app1/.gitignore
===> Writing app1/LICENSE
===> Writing app1/README.md

# optionally add dependencies
$ cat rebar.config
{deps, [{cowboy, {git, "git://github.com/ninenines/cowboy.git", {tag, "1.0.1"}}}]}.

# add host to nodename otherwise you get "Can't set long node name" when starting console
$ cat config/vm.args
-name app1@127.0.0.1

# and release (uses relx instead of reltool)
$ rebar3 release
===> Resolved app1-0.1.0
===> Dev mode enabled, release will be symlinked
===> release successfully created!
# if "Missing beam file elf_format ... elf_format.beam" then 'sudo apt-get install erlang-base-hipe'

# test using console
$ ./_build/rel/app1/bin/app1-0.1.0 console
1> app1_srv:say_hello().
Hello from server!
ok

# deploy/export
$ REBAR_PROFILE=prod rebar3 tar
===> tarball .../_build/rel/app1/app1-0.1.0.tar.gz successfully created!

# upgrading
$ cat apps/app1/src/app1.app.src
,{vsn, "0.2.0"}
$ cat rebar.config
{relx, [{release, {'app1', "0.2.0"},
$ cat apps/app1/src/app1_srv.erl
    io:format("Hello from server v2!~n"),
$ mv _build/rel _build/rel_0.1.0
$ REBAR_PROFILE=prod ../rebar3 tar
===> tarball .../_build/rel/app1/app1-0.2.0.tar.gz successfully created!
$ cp _build/rel/app1/app1-0.2.0.tar.gz _build/rel_0.1.0/app1/releases/app1_0.2.0.tar.gz
$ _build/rel_0.1.0/app1/bin/app1 start
$ _build/rel_0.1.0/app1/bin/app1 upgrade 0.2.0
# you will get "noent ... reup" because '.appup' is missing, see https://github.com/rebar/rebar3/issues/57

from basic usage

How to do static code analysis in C/C++ (using sparse, splint, cpplint and clang)

Static program analysis examines source code without executing it (as opposed to dynamic analysis). It is generally used to find bugs or to ensure conformance to coding guidelines.

  • sparse@wiki/sparse@man is a static analysis tool that was initially designed to only flag constructs that were likely to be of interest to kernel developers, such as the mixing of pointers to user and kernel address spaces. cgcc@man is a perl-script compiler wrapper to run Sparse after compiling.
## install
$ sudo apt-get install sparse | $ sudo yum install sparse (EPEL)

## example
$ cat test.c
#include <stdio.h>
int main(void) {
        int *p = 0;
        printf("Hello, World\n");
        return 0;
}

$ cgcc -Wsparse-all -c test.c
(or make CC=cgcc)
test.c:4:18: warning: Using plain integer as NULL pointer

# if you get error: "unable to open 'sys/cdefs.h'" then
$ sudo ln -s /usr/include/x86_64-linux-gnu/sys /usr/include/sys
$ sudo ln -s /usr/include/x86_64-linux-gnu/bits /usr/include/bits
$ sudo ln -s /usr/include/x86_64-linux-gnu/gnu /usr/include/gnu
  • splint/splint@wiki/splint@man statically checks C programs for security vulnerabilities and coding mistakes. Formerly called LCLint, it is a modern version of the Unix lint tool. The project’s last update was in November 2010.
## install
$ sudo apt-get install splint | $ sudo yum install splint (EPEL)

## example
$ cat test2.c
#include <stdio.h>
int main()
{
    char c;
    while (c != 'x');
    {
        c = getchar();
        if (c = 'x')
            return 0;
        switch (c) {
        case '\n':
        case '\r':
            printf("Newline\n");

    }
    return 0;
}

$ splint -hints test2.c
test2.c: (in function main)
test2.c:5:12: Variable c used before definition
test2.c:5:12: Suspected infinite loop.  No value used in loop test (c) is modified by test or loop body.
test2.c:7:9: Assignment of int to char: c = getchar()
test2.c:8:13: Test expression for if is assignment expression: c = 'x'
test2.c:8:13: Test expression for if not boolean, type char: c = 'x'
test2.c:18:1: Parse Error. (For help on parse errors, see splint -help parseerrors.)
*** Cannot continue.
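For comparison, a cleaned-up variant that addresses the issues splint reports might look like the sketch below. It is not the original program: it is restructured into a hypothetical function `count_newlines_until_x` that reads from a `FILE *` instead of stdin and counts line breaks rather than printing them, so that its behaviour is easy to check.

```c
#include <stdio.h>

/*
 * Reads characters from 'in' until an 'x' or end of file is seen.
 * Returns the number of '\n' and '\r' characters read before stopping.
 */
static int count_newlines_until_x(FILE *in) {
    int c;              /* int, not char: fgetc() returns int so EOF fits */
    int newlines = 0;

    /* c is assigned before it is tested, and no stray ';' ends the loop */
    while ((c = fgetc(in)) != EOF) {
        if (c == 'x')   /* comparison, not assignment */
            break;
        switch (c) {
        case '\n':
        case '\r':
            newlines++;
            break;
        default:
            break;
        }               /* switch is closed properly */
    }
    return newlines;
}
```

Passing `stdin` as the argument gives roughly the interactive behaviour of the original loop.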
  • cpplint is a Python script that checks C/C++ source against Google's C++ style guide.
## install
$ wget http://google-styleguide.googlecode.com/svn/trunk/cpplint/cpplint.py
$ chmod +x cpplint.py

## example
$ ./cpplint.py --extensions=c test2.c 
test2.c:0:  No copyright message found.  You should have a line: "Copyright [year] <Copyright Owner>"  [legal/copyright] [5]
test2.c:3:  { should almost always be at the end of the previous line  [whitespace/braces] [4]
test2.c:5:  Empty loop bodies should use {} or continue  [whitespace/empty_loop_body] [5]
test2.c:14:  Line ends in whitespace.  Consider deleting these extra spaces.  [whitespace/end_of_line] [4]
test2.c:14:  Redundant blank line at the end of a code block should be deleted.  [whitespace/blank_line] [3]
Done processing test2.c
Total errors found: 5
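  • clang ships a static analyzer, run either directly with 'clang --analyze' or through the 'scan-build' wrapper around a normal build.
## install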
$ sudo aptitude install clang | sudo yum install clang (EPEL)

## example
$ cat test3.c 
void test() {
  int x;
  x = 1; // warn
}

$ clang --analyze test3.c 
test3.c:3:3: warning: Value stored to 'x' is never read
  x = 1; // warn
  ^   ~
1 warning generated.

$ scan-build gcc -c test3.c 
scan-build: Using '/usr/lib/llvm-3.5/bin/clang' for static analysis
test3.c:3:3: warning: Value stored to 'x' is never read
  x = 1; // warn
  ^   ~
1 warning generated.
scan-build: 1 bug found.

Using semantic patching with Coccinelle (a patching tool that knows C)

Coccinelle is a program matching and transformation engine which provides the language SmPL (Semantic Patch Language) for specifying desired matches and transformations in C code.

## install see http://coccinelle.lip6.fr/download.php
$ sudo apt-get install coccinelle | sudo yum install coccinelle (from fedora rawhide)

## usage
spatch -sp_file <SP> <files> [-o <outfile> ] [-iso_file <iso> ] [ options ]

## examples
$ cat test.cocci
// Replaces calls to alloca by malloc and checks return value
@@
expression E;
identifier ptr;
@@
-ptr = alloca(E);
+ptr = malloc(E);
+if (ptr == NULL)
+        return 1;

$ cat test.c
#include <alloca.h>
int main(int argc, char *argv[]) {
    unsigned int bytes = 1024 * 1024;
    char *buf;
    /* allocate memory */
    buf = alloca(bytes);
    return 0;
}

$ spatch -sp_file test.cocci test.c
--- test.c
+++ /tmp/cocci-output-29896-40280c-test.c
@@ -3,6 +3,8 @@ int main(int argc, char *argv[]) {
     unsigned int bytes = 1024 * 1024;
     char *buf;
     /* allocate memory */
-    buf = alloca(bytes);
+    buf = malloc(bytes);
+    if (buf == NULL)
+        return 1;
     return 0;
}

from coccinelle, coccinelle@lwn, coccinelle for the newbie and coccinelle patch examples
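Note that the semantic patch above only rewrites the call site. Since malloc() is declared in <stdlib.h> and heap memory is not released automatically the way alloca() memory is, the patched file would still need a follow-up edit by hand. A finished version might look roughly like this; the function name `allocate_and_release` is made up for the example and stands in for the body of main().

```c
#include <stdlib.h>   /* malloc, free: not added by the semantic patch */

/*
 * Hypothetical stand-in for the patched main() body.
 * Returns 1 if the allocation fails, 0 otherwise.
 */
static int allocate_and_release(unsigned int bytes) {
    char *buf;

    /* allocate memory */
    buf = malloc(bytes);
    if (buf == NULL)
        return 1;

    /* unlike alloca(), malloc() memory must be released explicitly */
    free(buf);
    return 0;
}
```

Both manual steps could in principle be expressed as further semantic patch rules, but that is beyond this small example.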

Using non-blocking and asynchronous I/O (C10K problem) in Linux and Windows (with epoll, iocp, libevent/libev/libuv/boost.asio and librt/libaio)

C10k problem/C10k problem@wiki is the problem of optimizing network sockets to handle a large number of clients at the same time.

A thread per client scales only to a certain number of clients for a given amount of RAM. To scale beyond that, you need to minimize the state kept per client.

On most UNIXes, that number is around 300. On Windows, it’s around 800. I personally would only recommend it for applications that plan to handle 100 clients or fewer, or on platforms where you know the threading library works well this way.

Converting threaded programs to pure async is a disaster. For one thing, you can never, ever block under any circumstances on pain of total disaster. This means every single line of code is performance critical. For all but the most trivial applications, this alone is a deal killer.
from lkml.org

One thread per client doesn’t scale. We must serve many clients with each thread.

In non-blocking IO (O_NONBLOCK) you start the IO, get an error (EWOULDBLOCK) if it would block, and use readiness notification (poll, epoll, ...) to know when it's OK to start the next IO. Usable for network but not disk IO.

In asynchronous/completion IO you start the IO and get a completion notification (signal or completion ports) to know when it has finished. Works in both network and disk IO.
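The non-blocking half of this can be tried out without any network code. The following is a minimal sketch that uses a pipe in place of a socket: a read on an empty non-blocking descriptor fails immediately with EAGAIN/EWOULDBLOCK instead of sleeping, and succeeds once data is available. The helper names `set_nonblocking` and `nonblocking_pipe_demo` are made up for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical demo: read from an empty non-blocking pipe, then write
 * four bytes and read again.
 * Returns 1 if the first read reported "would block" and the second
 * read returned the data, 0 otherwise.
 */
static int nonblocking_pipe_demo(void) {
    int fds[2];
    char buf[16];
    int ok = 0;

    if (pipe(fds) == -1)
        return 0;
    if (set_nonblocking(fds[0]) == 0) {
        /* nothing written yet: a blocking read would sleep here */
        ssize_t n = read(fds[0], buf, sizeof buf);
        int would_block = (n == -1 &&
                           (errno == EAGAIN || errno == EWOULDBLOCK));
        if (would_block && write(fds[1], "ping", 4) == 4) {
            /* data is ready now, so the same call succeeds */
            n = read(fds[0], buf, sizeof buf);
            ok = (n == 4 && memcmp(buf, "ping", 4) == 0);
        }
    }
    close(fds[0]);
    close(fds[1]);
    return ok;
}
```

A real server would not retry in a loop after EWOULDBLOCK; it would hand the descriptor to poll or epoll and come back when it is reported ready.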

Edge-triggered readiness notification means you give the kernel a file descriptor, and later, when that descriptor transitions from not ready to ready, the kernel notifies you somehow. It then assumes you know the file descriptor is ready, and will not send any more readiness notifications of that type for that file descriptor until you do something that causes the file descriptor to no longer be ready (e.g. until you receive the EWOULDBLOCK error on a send, recv, or accept call, or a send or recv transfers less than the requested number of bytes).
from lkml.org
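The difference between edge-triggered and the default level-triggered mode can be observed on Linux with a pipe and two back-to-back epoll_wait() calls. The sketch below writes one byte, never reads it, and counts how often the descriptor is reported ready; `count_notifications` is a hypothetical helper written for this illustration.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Registers the read end of a pipe with epoll, writes one byte and then
 * polls twice without draining the pipe in between.
 * extra_flags is 0 for level-triggered or EPOLLET for edge-triggered.
 * Returns how many of the two polls reported the descriptor as ready,
 * or -1 if the setup failed.
 */
static int count_notifications(unsigned int extra_flags) {
    int fds[2];
    struct epoll_event ev, out;
    int epfd, i, count = 0;

    if (pipe(fds) == -1)
        return -1;
    epfd = epoll_create1(0);
    if (epfd == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    ev.events = EPOLLIN | extra_flags;
    ev.data.fd = fds[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) == -1 ||
        write(fds[1], "x", 1) != 1) {
        count = -1;
    } else {
        for (i = 0; i < 2; i++) {
            /* timeout 0: return immediately instead of blocking */
            if (epoll_wait(epfd, &out, 1, 0) == 1)
                count++;
        }
    }
    close(epfd);
    close(fds[0]);
    close(fds[1]);
    return count;
}
```

With `count_notifications(0)` both polls report the pipe as readable, because the unread byte keeps it ready. With `count_notifications(EPOLLET)` only the first poll does, which is why edge-triggered code has to keep reading until it sees EWOULDBLOCK.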

/* using edge-trigger epoll */

void setnonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* set up listening socket, 'listen_sock' (socket(), bind(), listen()) */
epollfd = epoll_create(10);
ev.events = EPOLLIN;
ev.data.fd = listen_sock;
epoll_ctl(epollfd, EPOLL_CTL_ADD, listen_sock, &ev);

for (;;) {
    /* block until some event happens */
    nfds = epoll_wait(epollfd, events, MAX_EVENTS, -1);
    for (n = 0; n < nfds; ++n) {
        if (events[n].data.fd == listen_sock) {
            conn_sock = accept(listen_sock, (struct sockaddr *) &local, &addrlen);
            setnonblocking(conn_sock);
            ev.events = EPOLLIN | EPOLLET;
            ev.data.fd = conn_sock;
            epoll_ctl(epollfd, EPOLL_CTL_ADD, conn_sock, &ev);
        } else {
            do_use_fd(events[n].data.fd);
        }
    }
}

from epoll@man

  • Asynchronous I/O completion ports@wiki (IOCP), Windows/Solaris only. You start an operation asynchronously and receive a notification when it has completed. Works in both network and disk IO.

There is a notify on ready model in Windows as well (select or WSAWaitForMultipleEvents) but it can’t scale to large numbers of sockets, so it’s not suitable for high-performance network applications.

The fundamental variation is that in a Unix you generally ask the kernel to wait for state change in a file descriptor’s readability or writability. With overlapped I/O and IOCPs the programmer waits for asynchronous function calls to complete. For example, instead of waiting for a socket to become writable and then using send(2) on it, as you commonly would do in a Unix, with overlapped I/O you would rather WSASend() the data and then wait for it to have been sent.
from Asynchronous I/O in Windows for Unix Programmers

// TCP echo-server

DWORD WINAPI ServerWorkerThread(LPVOID CompletionPortID) {
    while(TRUE) {
        GetQueuedCompletionStatus((HANDLE)CompletionPortID, &BytesTransferred, (PULONG_PTR)&PerHandleData, (LPOVERLAPPED *) &PerIoData, INFINITE);
        // ...

        // continue sending until all bytes are sent
        if (PerIoData->BytesRECV > PerIoData->BytesSEND) {
            WSASend(PerHandleData->Socket, &(PerIoData->DataBuf), 1, &SendBytes, 0, &(PerIoData->Overlapped), NULL);
        } else {
            WSARecv(PerHandleData->Socket, &(PerIoData->DataBuf), 1, &RecvBytes, &Flags, &(PerIoData->Overlapped), NULL);
        }
    }
}

int main(int argc, char **argv) {
    // setup an I/O completion port
    HANDLE CompletionPort = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 0);

    // create a server worker thread and pass the completion port to the thread
    HANDLE ThreadHandle = CreateThread(NULL, 0, ServerWorkerThread, CompletionPort,  0, &ThreadID);

    // create a listening socket
    // ...

    // accept connections and assign to the completion port
    while(TRUE) {
        SOCKET Accept = WSAAccept(Listen, NULL, NULL, NULL, 0);

        // associate the accepted socket with the original completion port
        LPPER_HANDLE_DATA PerHandleData = (LPPER_HANDLE_DATA) GlobalAlloc(GPTR, sizeof(PER_HANDLE_DATA));
        PerHandleData->Socket = Accept;
        CreateIoCompletionPort((HANDLE) Accept, CompletionPort, (ULONG_PTR) PerHandleData, 0);

        // create per I/O socket information structure to associate with the WSARecv
        LPPER_IO_OPERATION_DATA PerIoData = (LPPER_IO_OPERATION_DATA) GlobalAlloc(GPTR, sizeof(PER_IO_OPERATION_DATA));
        // ...
        WSARecv(Accept, &(PerIoData->DataBuf), 1, &RecvBytes, &Flags, &(PerIoData->Overlapped), NULL);    
    }
}

from IOComplete

  • libevent/libevent@wiki replaces the main event loop and executes callbacks when a specific event occurs on a file descriptor or after a timeout.
    It is a wrapper around epoll, kqueue and IOCP.
/* TCP echo-server */

struct client { int fd; struct bufferevent *buf_ev; };

int setnonblock(int fd) {
    int flags = fcntl(fd, F_GETFL);
    flags |= O_NONBLOCK;
    return fcntl(fd, F_SETFL, flags);
}

void buf_read_callback(struct bufferevent *incoming, void *arg) {
    /* echo back */
    char *req = evbuffer_readline(incoming->input);
    if (req == NULL) return; /* no complete line buffered yet */
    struct evbuffer *evreturn = evbuffer_new();
    evbuffer_add_printf(evreturn, "You said %s\n", req);
    bufferevent_write_buffer(incoming, evreturn);
    evbuffer_free(evreturn);
    free(req);
}

void buf_write_callback(struct bufferevent *bev, void *arg) {}
void buf_error_callback(struct bufferevent *bev, short what, void *arg) {...}

void accept_callback(int fd, short ev, void *arg) {
    /* accept non-blocking client socket */
    int client_fd = accept(fd, (struct sockaddr *)&client_addr, &client_len);
    setnonblock(client_fd);

    /* register callbacks */
    struct client *client = calloc(1, sizeof(*client));
    client->fd = client_fd;
    client->buf_ev = bufferevent_new(client_fd, buf_read_callback, buf_write_callback, buf_error_callback, client);

    bufferevent_enable(client->buf_ev, EV_READ);
}

int main(int argc, char **argv) {
    event_init();

    /* set SO_REUSEADDR before bind, then bind and listen on a non-blocking socket */
    setsockopt(socketlisten, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(reuse));
    bind(socketlisten, (struct sockaddr *)&addresslisten, sizeof(addresslisten));
    listen(socketlisten, 5);
    setnonblock(socketlisten);

    /* register callbacks and start loop */
    event_set(&accept_event, socketlisten, EV_READ|EV_PERSIST, accept_callback, NULL);
    event_add(&accept_event, NULL);
    event_dispatch();

    close(socketlisten);
    return 0;
}

from Boost network performance with libevent and libev

  • libev is a rewrite of libevent that also uses epoll/kqueue but has no IOCP support. It focuses on Unix I/O multiplexers.
    For disk IO use libeio, which provides asynchronous read, write, open, close, stat, unlink, fdatasync, mknod, readdir etc.
/* TCP echo server */

void accept_cb(struct ev_loop *loop, struct ev_io *watcher, int revents) {
    int client_sd = accept(watcher->fd, (struct sockaddr *)&client_addr, &client_len);
    if (client_sd < 0) return;

    /* initialize and start watcher to read client requests (freed in read_cb) */
    struct ev_io *w_client = malloc(sizeof(struct ev_io));
    ev_io_init(w_client, read_cb, client_sd, EV_READ);
    ev_io_start(loop, w_client);
}

void read_cb(struct ev_loop *loop, struct ev_io *watcher, int revents) {
    /* receive message from client socket */
    ssize_t nread = recv(watcher->fd, buffer, BUFFER_SIZE, 0);
    if (nread <= 0) {
        /* stop and free the watcher if the client socket is closing or failed */
        ev_io_stop(loop, watcher);
        close(watcher->fd);
        free(watcher);
        return;
    }

    /* send message back to the client */
    send(watcher->fd, buffer, nread, 0);
    bzero(buffer, nread);
}

int main() {
    struct ev_loop *loop = ev_default_loop(0);

    /* bind and listen ... */

    /* initialize and start a watcher to accepts client requests */
    ev_io_init(&w_accept, accept_cb, sd, EV_READ);
    ev_io_start(loop, &w_accept);

    while (1) ev_loop(loop, 0);

    return 0;
}

from libev tcp echo server

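  • libuv is a cross-platform asynchronous I/O library, originally written for Node.js, that wraps epoll, kqueue, event ports and IOCP.
/* TCP echo server */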

uv_loop_t *loop;

void alloc_buffer(uv_handle_t *handle, size_t suggested_size, uv_buf_t *buf) {
    buf->base = (char*) malloc(suggested_size);
    buf->len = suggested_size;
}

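void echo_write(uv_write_t *req, int status) {
    free(req->data); /* read buffer handed over in echo_read */
    free(req);
}

void echo_read(uv_stream_t *client, ssize_t nread, const uv_buf_t *buf) {
    if (nread < 0) {
        free(buf->base);
        uv_close((uv_handle_t*) client, NULL);
        return;
    }
    if (nread == 0) { free(buf->base); return; } /* nothing to echo */

    uv_write_t *req = (uv_write_t *) malloc(sizeof(uv_write_t));
    uv_buf_t wrbuf = uv_buf_init(buf->base, nread);
    req->data = buf->base; /* must stay valid until the write completes */
    uv_write(req, client, &wrbuf, 1, echo_write);
}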

void on_new_connection(uv_stream_t *server, int status) {
    if (status < 0) return;

    uv_tcp_t *client = (uv_tcp_t*) malloc(sizeof(uv_tcp_t));
    uv_tcp_init(loop, client);
    if (uv_accept(server, (uv_stream_t*) client) == 0) {
        uv_read_start((uv_stream_t*) client, alloc_buffer, echo_read);
    }
    else {
        uv_close((uv_handle_t*) client, NULL);
    }
}

int main() {
    loop = uv_default_loop();

    uv_tcp_t server;
    uv_tcp_init(loop, &server);

    struct sockaddr_in addr;
    uv_ip4_addr("0.0.0.0", DEFAULT_PORT, &addr);

    uv_tcp_bind(&server, (const struct sockaddr*)&addr, 0);
    int r = uv_listen((uv_stream_t*) &server, DEFAULT_BACKLOG, on_new_connection);
    if (r) return 1;

    return uv_run(loop, UV_RUN_DEFAULT);
}

from uvbook

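  • boost.asio is a cross-platform C++ library for network and low-level I/O programming built around an asynchronous model.
/* TCP echo server */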
using boost::asio::ip::tcp;

class session : public std::enable_shared_from_this<session> {
public:
    session(tcp::socket socket) : socket_(std::move(socket)) { }
    void start() { do_read(); }
private:
    void do_read() {
        auto self(shared_from_this());
        socket_.async_read_some(boost::asio::buffer(data_, max_length), 
            [this, self](boost::system::error_code ec, std::size_t length) {
                if (!ec) { do_write(length); }
            });
    }
    void do_write(std::size_t length) {
        auto self(shared_from_this());
        boost::asio::async_write(socket_, boost::asio::buffer(data_, length),
            [this, self](boost::system::error_code ec, std::size_t /*length*/) {
                if (!ec) { do_read(); }
        });
    }
    tcp::socket socket_;
    enum { max_length = 1024 };
    char data_[max_length];
};

class server {
public:
    server(boost::asio::io_service& io_service, short port)
        : acceptor_(io_service, tcp::endpoint(tcp::v4(), port)), socket_(io_service) {
        do_accept();
    }
private:
    void do_accept() {
        acceptor_.async_accept(socket_, [this](boost::system::error_code ec) {
            if (!ec) { std::make_shared<session>(std::move(socket_))->start(); }
            do_accept();
        });
    }
    tcp::acceptor acceptor_;
    tcp::socket socket_;
};

int main(int argc, char* argv[]) {
    boost::asio::io_service io_service;
    server s(io_service, std::atoi(argv[1]));
    io_service.run();
    return 0;
}

from boost samples
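
For comparison, the same echo pattern can be sketched with the event loop from Python's standard asyncio module. This is not taken from any of the libraries above; the names handle_echo and demo, the 1024-byte read size and the use of an ephemeral port on 127.0.0.1 are assumptions made for illustration.

```python
import asyncio

async def handle_echo(reader, writer):
    """Echo every chunk back to the client until it disconnects."""
    while True:
        data = await reader.read(1024)   # suspends until the socket is readable
        if not data:                     # an empty read means the peer closed
            break
        writer.write(data)
        await writer.drain()             # wait until the send buffer has room
    writer.close()
    await writer.wait_closed()

async def demo(message=b"ping"):
    """Start the server on an ephemeral port, send one message, return the echo."""
    server = await asyncio.start_server(handle_echo, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(message)
    await writer.drain()
    echoed = await reader.readexactly(len(message))
    writer.close()
    await writer.wait_closed()
    server.close()
    await server.wait_closed()
    return echoed

if __name__ == "__main__":
    print(asyncio.run(demo()))
```

As in the C and C++ versions, the read and write steps never block the loop; here the callbacks are hidden behind await.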

  • POSIX Asynchronous I/O is implemented on Linux as aio_*@man in GNU libc using pthreads; link with librt (-lrt). It works for both network and disk I/O, including files with buffering enabled (no need for O_DIRECT), but the queue of outstanding operations is limited by the number of threads.
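#include <aio.h>
#include <signal.h>
#include <stdlib.h>
#include <strings.h>

/* using signals as notification for AIO requests */
void aio_completion_handler( int signo, siginfo_t *info, void *context );

void setup_io( ... ) {
    int fd, ret;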
    struct sigaction sig_act;
    struct aiocb my_aiocb;

    /* set up the signal handler */
    sigemptyset(&sig_act.sa_mask);
    sig_act.sa_flags = SA_SIGINFO;
    sig_act.sa_sigaction = aio_completion_handler;

    /* set up the AIO request */
    bzero( (char *)&my_aiocb, sizeof(struct aiocb) );
    my_aiocb.aio_fildes = fd;
    my_aiocb.aio_buf = malloc(BUF_SIZE+1);
    my_aiocb.aio_nbytes = BUF_SIZE;
    my_aiocb.aio_offset = next_offset;

    /* link the AIO request with the signal handler */
    my_aiocb.aio_sigevent.sigev_notify = SIGEV_SIGNAL;
    my_aiocb.aio_sigevent.sigev_signo = SIGIO;
    my_aiocb.aio_sigevent.sigev_value.sival_ptr = &my_aiocb;

    /* map the signal to the signal handler */
    ret = sigaction( SIGIO, &sig_act, NULL );

    ret = aio_read( &my_aiocb );
}

void aio_completion_handler( int signo, siginfo_t *info, void *context ) {
    struct aiocb *req;
    ssize_t ret;
    if (info->si_signo == SIGIO) {
        req = (struct aiocb *)info->si_value.sival_ptr;
        /* did the request complete? */
        if (aio_error( req ) == 0) {
            /* request completed successfully, get the return status */
            ret = aio_return( req );
        }
    }
    return;
}

from Using POSIX AIO API
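
The idea of emulating asynchronous I/O in user space with a pool of threads can be sketched in a few lines of Python. This is only an illustration of the concept, not how glibc implements aio_read; the pool size, the aio_read wrapper and the demo helper are hypothetical names chosen for this example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool size; it bounds how many requests are serviced at once.
POOL = ThreadPoolExecutor(max_workers=2)

def aio_read(fd, nbytes, offset, on_complete):
    """Queue a positional read; on_complete(data) runs in the worker thread."""
    def task():
        data = os.pread(fd, nbytes, offset)   # blocking read, but off the caller's thread
        on_complete(data)
        return data
    return POOL.submit(task)

def demo():
    """Submit two reads against a temporary file and collect the results."""
    results = []
    with tempfile.TemporaryFile() as tmp:
        tmp.write(b"0123456789")
        tmp.flush()
        fd = tmp.fileno()
        pending = [aio_read(fd, 4, 0, results.append),
                   aio_read(fd, 4, 6, results.append)]
        # the caller is free to do other work here
        for future in pending:
            future.result()                   # block until each request has completed
    return sorted(results)
```

With only two workers, a third request would wait in the queue until a thread is free, which is the limitation noted above.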

  • Native/kernel Linux Asynchronous I/O is exposed through libaio; link with -laio. It works only for disk I/O (and only with O_DIRECT), not network I/O, and silently falls back to synchronous operation if the underlying I/O doesn’t support it.

It lets I/O operations overlap with other processing by providing one interface, io_submit, for submitting one or more I/O requests in a single system call without waiting for completion, and a separate interface, io_getevents, to reap completed I/O operations associated with a given completion group.

/* raw system call wrappers; alternatively use libaio (#include <libaio.h>, link with -laio) */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/aio_abi.h>

int io_setup(unsigned nr, aio_context_t *ctxp) {
    return syscall(__NR_io_setup, nr, ctxp);
}
int io_destroy(aio_context_t ctx) {
    return syscall(__NR_io_destroy, ctx);
}
int io_submit(aio_context_t ctx, long nr,  struct iocb **iocbpp) {
    return syscall(__NR_io_submit, ctx, nr, iocbpp);
}
int io_getevents(aio_context_t ctx, long min_nr, long max_nr, struct io_event *events, struct timespec *timeout) {
    return syscall(__NR_io_getevents, ctx, min_nr, max_nr, events, timeout);
}

int main() {
    aio_context_t ctx = 0;
    struct iocb cb;
    struct iocb *cbs[1];
    char data[4096];
    struct io_event events[1];
    int ret, fd;

    fd = open("/tmp/testfile", O_RDWR | O_CREAT, 0644);
    ret = io_setup(128, &ctx);

    /* setup I/O control block */
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_lio_opcode = IOCB_CMD_PWRITE;
    cb.aio_buf = (uint64_t)data;
    cb.aio_offset = 0;
    cb.aio_nbytes = 4096;
    cbs[0] = &cb;

    ret = io_submit(ctx, 1, cbs);

    /* get the reply */
    ret = io_getevents(ctx, 1, 1, events, NULL);
    printf("%d\n", ret);

    ret = io_destroy(ctx);
    return 0;
}

from Linux Asynchronous I/O Explained and AIOUserGuide

Using SO_REUSEPORT (in Linux 3.9 and BSD) in prefork multithreaded servers

The new socket option allows multiple sockets on the same host to bind to the same port, and is intended to improve the performance of multithreaded network server applications running on top of multicore systems.

Incoming connections and datagrams are distributed to the server sockets using a hash based on the 4-tuple of the connection—that is, the peer IP address and port plus the local IP address and port.

SO_REUSEADDR socket option already allows multiple UDP sockets to be bound to, and accept datagrams on, the same UDP port. However, by contrast with SO_REUSEPORT, SO_REUSEADDR does not prevent port hijacking and does not distribute datagrams evenly across the receiving threads.
from The SO_REUSEPORT socket option, Linux 3.9 introduced new way of writing socket servers

SO_REUSEADDR enables local address reuse, SO_REUSEPORT enables duplicate address and port bindings
from getsockopt@freebsd and SO_REUSEPORT vs SO_REUSEADDR

#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <stdbool.h>
#include <arpa/inet.h>
#include <pthread.h>

void* do_work(void *arg)
{
    int *port = (int *) arg;

    int listen_socket = socket(AF_INET, SOCK_STREAM, 0);
    int one = 1;
    setsockopt(listen_socket, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));

    struct sockaddr_in serv_addr;
    memset(&serv_addr, 0, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_addr.s_addr = INADDR_ANY;
    serv_addr.sin_port = htons(*port);

    if (bind(listen_socket, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0) {
        perror("bind");
        return NULL;
    }
    listen(listen_socket, 5);

    struct sockaddr_in cli_addr;
    socklen_t addr_length;

    do
    {
        /* accept() updates addr_length, so reset it before every call */
        memset(&cli_addr, 0, sizeof(cli_addr));
        addr_length = sizeof(cli_addr);
        int cli_sock = accept(listen_socket, (struct sockaddr *) &cli_addr, &addr_length);
        close(cli_sock);
    } while (true);

    close(listen_socket);

    return 0;
}

int main(int ac, const char *av[])
{ 
    int port = atoi(av[1]);

    const int MAX_THREADS = 10;
    pthread_t tid[MAX_THREADS];
    for (int i = 0; i < MAX_THREADS; i++) {
        pthread_create(&tid[i], NULL, do_work, &port);
    }

    for (int i = 0; i < MAX_THREADS; i++) {
        pthread_join(tid[i], NULL);
    }
    return 0;
}
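
The same option is available from Python's socket module on platforms that define SO_REUSEPORT. The following is a minimal sketch, assuming Linux 3.9 or later; the helper names reuseport_listener and demo are made up for this example. Two listening sockets share one port, and a single incoming connection is handed to exactly one of them.

```python
import select
import socket

def reuseport_listener(port, host="127.0.0.1"):
    """Create a listening TCP socket with SO_REUSEPORT set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(5)
    return sock

def demo():
    """Bind two listeners to one port and see which of them gets a connection."""
    first = reuseport_listener(0)            # port 0: let the kernel pick a free port
    port = first.getsockname()[1]
    second = reuseport_listener(port)        # without the option this bind would fail
    same_port = second.getsockname()[1] == port
    client = socket.create_connection(("127.0.0.1", port))
    # the kernel queued the connection on one of the two listeners
    ready, _, _ = select.select([first, second], [], [], 2.0)
    accepted = [listener.accept()[0] for listener in ready]
    for sock in accepted + [client, first, second]:
        sock.close()
    return same_port, len(accepted)
```

In a real server each listener would live in its own thread or process, as in the C example above.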

Using grep and its alternatives for source code (ack/ag/git-grep/cgrep/sgrep/jq/xgrep) and fuzzy searches (agrep/tre)

grep@man prints lines matching a pattern. In addition, two variant programs egrep and fgrep are available. egrep is the same as grep -E. fgrep is the same as grep -F.

grep [OPTIONS] PATTERN [FILE...]

# matching control
'-E,-F,-G,-P' interpret PATTERN as extended regexp, fixed string, basic regexp (default) or perl regexp
'-i/--ignore-case' case insensitive
'-v/--invert-match'
'-w/--word-regexp' select only those lines containing matches that form whole words

# output control
'-c/--count' output only match count
'-l/--files-with-matches' output only file names
'-m/--max-count=NUM' stop at num matches
'-q/--quiet/--silent' don't write any output, exit immediately with zero status if any match is found
'--color[=always|never|auto]' surround the matched string in color

# output prefix
'-H/--with-filename' output file name for each match
'-n/--line-number' output match line number
'-A/--after-context=NUM','-B/--before-context=NUM','-C/--context=NUM' output 'NUM' lines before/after/around

# file selection
'--exclude=GLOB','--include=GLOB' exclude/include-only files whose base name matches GLOB
'-R/-r/--recursive' read files recursively

# regexp howto
'.' matches any single char
'[]' matches any one of a list of chars; classes such as [:alnum:],[:digit:] go inside the brackets, eg: [[:digit:]]; '[^]' matches any char not in the list
'^','$' match at beginning/end of line
'?' '*' '+' '{n}' '{n,}' '{,m}' '{n,m}' match quantifiers
'|' matches either regexp
'()' group regexps
in basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; use the backslashed versions \?, \+, \{, \|, \(, and \) instead

from grep examples and howto use grep
see why GNU grep is fast
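
How the matching and output flags combine can be modelled in a few lines of Python. This is a rough sketch, not how grep is implemented: the function grep and its keyword arguments are hypothetical, Python's re syntax is closer to '-E'/'-P' than to basic regexps, and '-w' is only approximated with word boundaries.

```python
import re

def grep(pattern, lines, ignore_case=False, invert=False, word=False,
         count=False, line_numbers=False):
    """Tiny model of grep with -i, -v, -w, -c and -n."""
    if word:
        # approximate -w: the match must sit between word boundaries
        pattern = r"\b(?:%s)\b" % pattern
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    # -v flips the selection: keep the lines that do NOT match
    selected = [(number, line) for number, line in enumerate(lines, 1)
                if bool(regex.search(line)) != invert]
    if count:
        return len(selected)
    if line_numbers:
        return ["%d:%s" % (number, line) for number, line in selected]
    return [line for _, line in selected]

SAMPLE = ["int sort(int *a);", "/* resort later */", "Sort(x);"]
```

For example, grep("sort", SAMPLE, word=True) keeps only the first line, because "resort" has no word boundary before "sort" and "Sort" differs in case.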

ack/ack@man is a grep-like Perl script optimized for code search; it is faster because it skips unnecessary files. It searches the current directory recursively by default, ignores meta directories (.git), binaries and backups (~), prints line numbers, highlights matches in color and supports Perl regexps.

# install
$ sudo apt-get install ack-grep | sudo yum install ack (EPEL)
$(deb) sudo dpkg-divert --local --divert /usr/bin/ack --rename --add /usr/bin/ack-grep

ack [options] PATTERN [FILE...]

# matching control
'-w/--word-regexp' force PATTERN to match only whole words
'-Q/--literal' quote all metacharacters in PATTERN so that it is treated as a literal

# file selection
'--[no]ignore-dir=DIRNAME' ignore/don't ignore directory
'--type=[no]TYPE' specify the types of files to include or exclude from a search
'--type-set=[NAME]=.[ext],.[another-ext]' adds types
'--help-type' list types

# output control
'-A/--after-context=NUM','-B/--before-context=NUM','-C/--context=NUM' output 'NUM' lines before/after/around
'-c/--count' output only match count
'--group/--nogroup' groups matches by file name

from ack@xmodulo

ag is like ack but faster; it skips files listed in ‘.gitignore’ and ‘.agignore’.

# install
$ sudo apt-get install silversearcher-ag | sudo yum install the_silver_searcher (EPEL) | cinst ag (windows/chocolatey)

git-grep is like ack/ag but for git repositories.

git grep [options] PATTERN [<pathspec>...]

# file selection (defaults to working directory)
'--cached' searches blobs registered in the index file
'--no-index' searches files in the current directory that are not managed by Git
'--untracked' also searches in untracked files

# matching control
'-E,-F,-G,-P' interpret PATTERN as extended regexp, fixed string, basic regexp (default) or perl regexp
'-i/--ignore-case' ignores case
'--max-depth DEPTH' descend at most DEPTH levels of directories
'-w/--word-regexp' match the pattern only at word boundary
'-v/--invert-match' select non-matching lines
'-e,--and,--or,--not,()' specify how multiple patterns are combined using Boolean expressions

# output control
'-c/--count' print only the number of matching lines per file
'--color[=always|auto|never]' show colored matches
'-h/-H' suppress/show the file name for each match
'-n/--line-number' prefix line number
'-q/--quiet' don't write any output, exit immediately with zero status if any match is found
'-A/--after-context=NUM','-B/--before-context=NUM','-C/--context=NUM' output 'NUM' lines before/after/around
'-p/--show-function' show preceding line with function name
'-W/--function-context' show the whole function in which the match was found

cgrep/cgrep@ubuntu is a context-aware grep for source code. Another alternative to ack/ag.

# install
$ wget https://github.com/awgn/cgrep/releases | sudo apt-get install

cgrep [OPTIONS] [ITEM]

# context filters and semantic (generic)
'-c/--code' search in source code
'-m/--comment' search in comments
'-l/--literal' search in string literals
'-S/--semantic' "code" pattern: _, _1, _2... (identifiers), $, $1, $2... (optionals), ANY, KEY, STR, CHR, NUM, HEX, OCT, OR.
e.g. "_1(_1 && $)" search for move constructors, "struct OR class _ { OR : OR <" search for a class declaration

# search for a variable
$ cgrep -r --identifier VARname

# search recursively for headers
$ cgrep -r --header "stdio.h"

# search for call (from any struct or pointer) to 'func' with '5' as 2nd argument
$ cgrep --code --semantic '_1 . OR -> func ( _2 , 5, _3 )' file.c

from cgrep@github

sgrep is a grep for structured text files. The data model of sgrep is based on regions, which are non-empty substrings of text. Regions are typically occurrences of constant strings or meaningful text elements, which are recognizable through some delimiting strings.

# install
$ sudo apt-get install sgrep | sudo yum install sgrep (Olea)

# show all blocks delimited by braces
$ sgrep '"{" .. "}"' file.c
# show the outermost blocks that contain "sort" or "nest"
$ sgrep 'outer("{" .. "}" containing ("sort" or "nest"))' file.c

# show all lines containing "sort" but no "nest" in files with an extension .c, preceded by the name of the file
$ sgrep -o "%f:%r" '"\n" _. "\n" containing "sort" not containing "nest"' *.c

# show the beginning of conditional statements, consisting of "if" followed by a condition in parentheses, in files *.c
# ignore "if"s appearing within comments "/* ... */" or on compiler control lines beginning with '#':
$ sgrep '"if" not in ("/*" quote "*/" or ("\n#" .. "\n")) .. ("(" ..  ")")' *.c

from sgrep@man

jq@github command-line JSON processor in C (no extra dependencies).
You can use it to slice and filter and map and transform structured data, alternative to awk, sed and grep.

# install
$ sudo yum install jq (EPEL) | sudo apt-get install jq

$ cat json.txt
{"name": "Google", 
 "location": {"street": "1600 Amphitheatre Parkway","city": "Mountain View", "state": "California","country": "US"},
 "employees": [{"name": "Michael","division": "Engineering"},{"name": "Laura","division": "HR"},{"name": "Elise","division": "Marketing"}]
}

# parse object
$ cat json.txt | jq '.name' 
"Google"

# parse nested object
$ cat json.txt | jq '.location.city' 
"Mountain View"

# parse array
$ cat json.txt | jq '.employees[0].name'
"Michael"

# extract specific fields from object
$ cat json.txt | jq '.location | {street, city}' 
{"city": "Mountain View","street": "1600 Amphitheatre Parkway"}

from How to parse JSON string via command line on Linux and jq tutorial
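
When jq is not installed, the same lookups can be approximated with Python's standard json module. This is a minimal sketch; the helpers path and pick are hypothetical names for this example and cover only the simple filters shown above.

```python
import json

DOC = json.loads("""
{"name": "Google",
 "location": {"street": "1600 Amphitheatre Parkway", "city": "Mountain View",
              "state": "California", "country": "US"},
 "employees": [{"name": "Michael", "division": "Engineering"},
               {"name": "Laura", "division": "HR"},
               {"name": "Elise", "division": "Marketing"}]}
""")

def path(doc, *keys):
    """Follow a jq-style path such as .employees[0].name, given as keys and indexes."""
    for key in keys:
        doc = doc[key]
    return doc

def pick(obj, *fields):
    """Rough equivalent of jq's {street, city} object construction."""
    return {field: obj[field] for field in fields}

if __name__ == "__main__":
    print(json.dumps(path(DOC, "name")))                  # like jq '.name'
    print(json.dumps(path(DOC, "employees", 0, "name")))  # like jq '.employees[0].name'
    print(json.dumps(pick(path(DOC, "location"), "street", "city")))
```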

xgrep@man searches the content of XML files

# install
$ sudo yum install xgrep (EPEL) | sudo apt-get install xgrep

'-x xpath' xpath specification of the elements of interest
'-s string' string format in base-element:element/regex/,element/regex/,... where base-element is the name of the elements within which a match should be attempted, the match succeeding if, for each element/regex/ pair, the content of an element of that name is matched by the corresponding regex. If multiple -s flags are specified, a match by any one of them is returned.

# find all person elements with "Smith" in the content of the name element and "2000" in the content of the hiredate element
$ xgrep -s 'person:name/Smith/,hiredate/2000/' *.xml
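
The matching rule behind '-s' can be sketched with Python's xml.etree.ElementTree. This is an approximation for illustration, not xgrep's code: the function xgrep_s and the SAMPLE document are made up, and the sketch assumes that any descendant element with the given name may satisfy a pair.

```python
import re
import xml.etree.ElementTree as ET

SAMPLE = """
<staff>
  <person><name>John Smith</name><hiredate>2000-03-01</hiredate></person>
  <person><name>Ann Smith</name><hiredate>1999-07-15</hiredate></person>
  <person><name>Bob Jones</name><hiredate>2000-01-10</hiredate></person>
</staff>
"""

def xgrep_s(xml_text, spec):
    """Return base elements matching a spec like 'person:name/Smith/,hiredate/2000/'."""
    base, _, pairs = spec.partition(":")
    # split 'name/Smith/,hiredate/2000/' into (element, regex) pairs
    conditions = re.findall(r"(\w+)/([^/]*)/", pairs)
    root = ET.fromstring(xml_text)
    matches = []
    for element in root.iter(base):
        # every element/regex pair must be satisfied by some element of that name
        if all(any(re.search(regex, child.text or "")
                   for child in element.iter(name))
               for name, regex in conditions):
            matches.append(element)
    return matches

if __name__ == "__main__":
    for person in xgrep_s(SAMPLE, "person:name/Smith/,hiredate/2000/"):
        print(person.findtext("name"))
```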

agrep@wiki “approximate grep” is a proprietary fuzzy grep. TRE/agrep@man is a lightweight, robust, and efficient POSIX compliant regexp matching library with some exciting features such as approximate (fuzzy) matching.

# install
$ sudo apt-get install tre-agrep | sudo yum install agrep (EPEL)
$(deb) sudo dpkg-divert --local --divert /usr/bin/agrep --rename --add /usr/bin/tre-agrep

agrep [OPTION]... PATTERN [FILE]...

# regexp selection and interpretation
'-i/--ignore-case' ignore case distinctions
'-k/--literal' treat PATTERN as a literal string
'-w/--word-regexp' force PATTERN to match only whole words
'-v/--invert-match' select non-matching records instead of matching records

# approximate matching settings
'-D/--delete-cost=NUM' set cost of missing characters to NUM
'-I/--insert-cost=NUM' set cost of extra characters to NUM
'-S/--substitute-cost=NUM' set cost of incorrect characters to NUM
Note that a deletion (a missing character) and an insertion (an extra character) together constitute a substituted character, but the cost will be that of a deletion and an insertion added together.
'-E/--max-errors=NUM' select records that have at most NUM errors.
'-#' select records that have at most # errors (# is a digit between 0 and 9)

# output control
'--color' show colored matches
'-c/--count' print only a count of matching records
'-s/--show-cost' print match cost
'-H/--with-filename' prefix with file name
'-l/--files-with-matches' only print file name

$ tre-agrep -5 -s -i resume example.txt
2:Résumé
1:Resümee
3:rèsümê
0:Resume
5:linuxaria

from How to do fuzzy search with tre-agrep
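
The cost model behind the delete, insert and substitute options can be illustrated with a small dynamic-programming routine. This is a textbook approximate substring match written for this note, not TRE's algorithm; the function name match_cost and its parameters are hypothetical, and its results need not agree with tre-agrep in every case.

```python
def match_cost(pattern, text, delete_cost=1, insert_cost=1, substitute_cost=1):
    """Cheapest cost of matching pattern against any substring of text."""
    # prev[j]: cheapest cost of matching pattern[:j] against some substring
    # that ends at the current position in the text.
    prev = [j * delete_cost for j in range(len(pattern) + 1)]
    best = prev[-1]
    for ch in text:
        cur = [0]  # a match may start anywhere in the text at no cost
        for j, pch in enumerate(pattern, 1):
            cur.append(min(
                prev[j - 1] + (0 if ch == pch else substitute_cost),
                prev[j] + insert_cost,      # extra character in the text
                cur[j - 1] + delete_cost,   # pattern character missing from the text
            ))
        best = min(best, cur[-1])
        prev = cur
    return best

if __name__ == "__main__":
    print(match_cost("resume", "my resume.txt"))
    print(match_cost("abc", "axc"))
    print(match_cost("abc", "axc", substitute_cost=5))
```

With the default costs, matching "abc" against "axc" costs 1 (one substitution). Raising the substitution cost to 5 makes a deletion plus an insertion the cheaper route, so the cost becomes 2.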

How to write a D-Bus service for Linux (in Python)

D-Bus@wiki is an inter-process communication (IPC) system that allows multiple concurrently running programs (processes) to communicate with one another.

It includes libdbus, dbus-daemon (a message-bus daemon executable that multiple applications can connect to) and wrappers for each application framework.

Select bus type: either ‘SessionBus’ (each user login has a session bus, used to communicate between desktop applications) or ‘SystemBus’ (global and usually started during boot, used to communicate with udev, hald, networkmanager).

Register bus name: a string which identifies the application or the service provided by the application, e.g. “org.documentroot.Calculator”

Register bus objects: identified by an object path (like a unix path), e.g.: ‘/MainApplication’

Dbus query

# show all bus names (session bus):
qdbus
# show all objects offered by klauncher:
qdbus org.kde.klauncher
# show all methods/signals offered by the KLauncher object:
qdbus org.kde.klauncher /KLauncher

Dbus Service in Python, see dbus-python API

## install
$ sudo apt-get install python-dbus | sudo yum install dbus-python

## daemon
$ cat service.py
#!/usr/bin/env python
import dbus
import dbus.service

class SomeObject(dbus.service.Object):
    def __init__(self):
        self.session_bus = dbus.SessionBus()
        name = dbus.service.BusName("com.example.SampleService", bus=self.session_bus)
        dbus.service.Object.__init__(self, name, '/SomeObject')
    @dbus.service.method("com.example.SampleInterface", in_signature='s', out_signature='as')
    def HelloWorld(self, hello_message):
        return ["Hello", "from example-service.py", "with unique name", self.session_bus.get_unique_name()]
    @dbus.service.method("com.example.SampleInterface", in_signature='', out_signature='')
    def Exit(self):
        loop.quit()
if __name__ == '__main__':
    # using glib
    import dbus.mainloop.glib
    dbus.mainloop.glib.DBusGMainLoop(set_as_default=True)
    import gobject
    loop = gobject.MainLoop()
    object = SomeObject()
    loop.run()

Using qt4/gtk loop

$ cat service.py
...
if __name__ == '__main__':
    # using qt4 loop    
    #import dbus.mainloop.qt
    #dbus.mainloop.qt.DBusQtMainLoop(set_as_default=True)
    #from PyQt4.QtCore import *
    #app = QCoreApplication([])
    #object = SomeObject()
    #app.exec_()

    # using gtk loop    
    #import dbus.mainloop.glib
    #dbus.mainloop.glib.DBusGMainLoop(set_as_default=True)
    #from gi.repository import Gtk
    #object = SomeObject()
    #Gtk.main()

Just run the script at startup (or login); it will be terminated as part of shutdown (or logout). Alternatively, use a D-Bus-initiated start: install a service file, and dbus will start the service when a message (e.g. from dbus-send) is addressed to it.

$ cat /usr/share/dbus-1/services/sample.service
[D-BUS Service]
Name=com.example.SampleService
Exec="/root/service.py" 

Clients

$ qdbus com.example.SampleService /SomeObject HelloWorld "hello from cli"
$ dbus-send --session --print-reply --dest="com.example.SampleService" /SomeObject com.example.SampleInterface.HelloWorld string:"hello from cli"

$ cat client.py
#!/usr/bin/env python
import sys
from traceback import print_exc
import dbus

def main():
    bus = dbus.SessionBus()
    remote_object = bus.get_object("com.example.SampleService", "/SomeObject")
    print ' '.join(remote_object.HelloWorld("Hello from example-client.py!", dbus_interface = "com.example.SampleInterface"))
    # ... or create an Interface wrapper for the remote object
    iface = dbus.Interface(remote_object, "com.example.SampleInterface")
    print iface.HelloWorld("Hello from example-client.py!")
    # introspection is automatically supported
    print remote_object.Introspect(dbus_interface="org.freedesktop.DBus.Introspectable")
    if sys.argv[1:] == ['--exit-service']:
        iface.Exit()

if __name__ == '__main__':
    main()

Queueing: If we try to register a bus name (via dbus.service.BusName) which is already occupied, the request is silently appended to a queue and waits for the bus name to become available.

'do_not_queue=True' to disable queuing
'replace_existing=True' to try to replace the bus name if it exists
'allow_replacement=True' to allow other process to replace the newly registered bus name
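
How the three flags interact can be seen in a toy model of the bus's name queue. This is a simplified sketch written for this note, not dbus-daemon's or dbus-python's code: the class ToyBus and its return strings are made up, and details such as an old owner that asked not to be queued are ignored.

```python
class ToyBus:
    """Toy model of D-Bus name ownership; the first queue entry owns the name."""

    def __init__(self):
        self.queues = {}   # name -> list of (client, allow_replacement)

    def request_name(self, client, name, allow_replacement=False,
                     replace_existing=False, do_not_queue=False):
        queue = self.queues.setdefault(name, [])
        entry = (client, allow_replacement)
        if not queue:
            queue.append(entry)
            return "primary owner"
        owner, owner_allows = queue[0]
        if owner == client:
            queue[0] = entry
            return "already owner"
        # drop any earlier queued request from the same client
        queue[1:] = [e for e in queue[1:] if e[0] != client]
        if replace_existing and owner_allows:
            queue.insert(0, entry)      # the old owner drops back into the queue
            return "primary owner"
        if do_not_queue:
            return "exists"
        queue.append(entry)
        return "in queue"

    def release_name(self, client, name):
        queue = self.queues.get(name, [])
        queue[:] = [e for e in queue if e[0] != client]

    def owner(self, name):
        queue = self.queues.get(name)
        return queue[0][0] if queue else None
```

In this model a second request for "com.example.SampleService" returns "in queue" by default and "exists" with do_not_queue=True; it takes the name over only when it passes replace_existing=True and the current owner registered with allow_replacement=True.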

Making asynchronous calls: pass ‘reply_handler’ and ‘error_handler’

$ cat client-async.py
#!/usr/bin/env python
import sys
import dbus
import dbus.mainloop.glib

def handle_hello_reply(r): print r
def handle_hello_error(e): print e
def make_calls():
    remote_object.HelloWorld("Hello from async-client.py!", dbus_interface='com.example.SampleInterface', 
        reply_handler=handle_hello_reply, error_handler=handle_hello_error)
    return False
if __name__ == '__main__':
    dbus.mainloop.glib.DBusGMainLoop(set_as_default=True)
    bus = dbus.SessionBus()
    remote_object = bus.get_object("com.example.SampleService","/SomeObject")
    import gobject
    # delay call
    gobject.timeout_add(1000, make_calls)
    gobject.MainLoop().run()

Signals are one-way messages. They carry input parameters, which are received by all objects which have registered for such a signal.

$ cat signal-emitter.py
#!/usr/bin/env python
import dbus
import dbus.service
import dbus.mainloop.glib

class TestObject(dbus.service.Object):
    def __init__(self, conn, object_path='/com/example/TestService/object'):
        dbus.service.Object.__init__(self, conn, object_path)
    @dbus.service.signal('com.example.TestService')
    def HelloSignal(self, message):
        # signal is emitted when this method exits
        pass
    @dbus.service.method('com.example.TestService')
    def emitHelloSignal(self):
        # you emit signals by calling the signal's skeleton method
        self.HelloSignal('Hello')
        return 'Signal emitted'
if __name__ == '__main__':
    dbus.mainloop.glib.DBusGMainLoop(set_as_default=True)
    session_bus = dbus.SessionBus()
    name = dbus.service.BusName('com.example.TestService', session_bus)
    object = TestObject(session_bus)
    import gobject
    gobject.MainLoop().run()

$ cat signal-recipient.py
#!/usr/bin/env python
import gobject
import sys
import traceback
import dbus
import dbus.mainloop.glib

def handle_reply(msg): print msg
def handle_error(e): print str(e)
def emit_signal():
    # call the emitHelloSignal method
    object.emitHelloSignal(dbus_interface="com.example.TestService")
                           #reply_handler=handle_reply, error_handler=handle_error)
    # exit after waiting a short time for the signal
    gobject.timeout_add(2000, loop.quit)
    return False
def hello_signal_handler(hello_string):
    print ("Received signal (by connecting using remote object) and it says: " + hello_string)
def catchall_signal_handler(*args, **kwargs):
    print ("Caught signal (in catchall handler) ", kwargs['dbus_interface'] + "." + kwargs['member'])
def catchall_hello_signals_handler(hello_string):
    print "Received a hello signal and it says " + hello_string

if __name__ == '__main__':
    dbus.mainloop.glib.DBusGMainLoop(set_as_default=True)
    bus = dbus.SessionBus()
    object = bus.get_object("com.example.TestService","/com/example/TestService/object")
    object.connect_to_signal("HelloSignal", hello_signal_handler, dbus_interface="com.example.TestService", arg0="Hello")
    # let's make a catchall
    bus.add_signal_receiver(catchall_signal_handler, interface_keyword='dbus_interface', member_keyword='member')
    bus.add_signal_receiver(catchall_hello_signals_handler, dbus_interface="com.example.TestService", signal_name="HelloSignal")
    gobject.timeout_add(2000, emit_signal)
    # keep a reference so emit_signal can call loop.quit
    loop = gobject.MainLoop()
    loop.run()

from dbus-python tutorial, Dbus Tutorial – Create a service and Interprocess Communication with D-Bus and Python