## perl one-liners #misc

perl -l0pe # "el" and zero, to "collapse" all input lines


bash: check command failure but also pipe its output

I faced the same problem as described in https://stackoverflow.com/questions/1221833/pipe-output-and-capture-exit-status-in-bash.

set -o pipefail # by default, the exit status of a pipeline is the exit status of the LAST command; with pipefail, it is the exit status of the last command that failed (or zero if none failed).

This works with set -e.
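
A quick demo (out.log is just an illustrative filename) — tee succeeds, so without pipefail the failure of the first command is masked:

$ false | tee out.log; echo $?
0
$ set -o pipefail
$ false | tee out.log; echo $?
1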

 

perl regex top 9 tips #modifier /m /s

https://docstore.mik.ua/orelly/perl3/lperl/ch09_05.htm shows

The part of the string that actually matched the pattern is automatically stored in q[  $& ]

Whatever came before the matched section is in $` and whatever was after it is in $'. Another way to say that is that $` holds whatever the regular expression engine had to skip over before it found the match, and $' has the remainder of the string that the pattern never got to. If you glue these three strings together in order, you’ll always get back the original string.

— /m /s clarification:

  1. By default, q($) and q(^) match only at the start/end of the whole string, not at embedded newlines. /m targets q($) and q(^)
  2. By default, the dot q(.) won’t match newline. /s targets the dot.
  3. /m and /s both help get newlines matched, in different contexts.

Official doc says:

  1. /m  Treat the string being matched against as multiple lines. That is, change "^" and "$" from matching the start of the string’s first line and the end of its last line to matching embedded start and end of each line within the string.
  2. /s  Treat the string as single line. That is, change "." to match any character whatsoever, even a newline, which normally it would not match.
  3. Used together, as /ms, they let the "." match any character whatsoever, while still allowing "^" and "$" to match, respectively, just after and just before newlines within the string.
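
A minimal one-liner to see both modifiers in action — neither regex would match without its modifier:

$ perl -e '$_ = "aa\nbb"; print "m ok\n" if /^bb$/m; print "s ok\n" if /aa.bb/s'
m ok
s ok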

bash: split long command to multiple lines

I managed to split a huge g++ command line into about 300 lines… much more readable.

The trick:

  • terminate each line with a space, a backslash and … NO trailing space
  • my perl one-liner must use four backslashes to produce that single backslash, followed by \n
  • in total there are 5 backslashes in a row.

Here’s the command

perl -pe "s/ -/ \\\\\n-/g" build1file.sh

backslash escape: bash tricky rules

This is about shell interpreting the backslash sequence inside single-quote or double-quote.

Once bash does its parsing, it can pass the result to a command like perl or grep.

----Most escape sequences don't care about single-quote vs double-quote
$ echo "msgType\t"
msgType\t

$ echo "msgType\b"
msgType\b

# \b is meaningful in perl regex 🙂

----double backslash -- single-quote is simpler than double-quote
$ echo 'msgType\\'
msgType\\

$ echo "msgType\\"
msgType\

----single quote within single-quoted string is very tricky:
$ echo 'msgType\'\' 
msgType\'

# in the above, the last \' is a second token, a single-char string.

$ echo $'msgType\''  # dollar sign is crucial
msgType'

$ echo 'msgType\'' # doesn't work without $ — the final quote starts a new, unterminated string, so bash shows a continuation prompt
>

## vi (+less) cheatsheet

https://github.com/tiger40490/repo1/blob/bash/bash/vimrc has some tricks including how to make vim remember last edit location.

  • ~~~~ command mode #roughly ranked
  • [2/3] :↑ (i.e. up-arrow) — cycle through previous :commands
  • [3] dt{char} — eg “dta” deletes up to (but not including) the next “a”
  • [2] 6x — delete 6 chars
  • [2] 9s — wipe out 9 characters (including current) and enter insert-mode. Better than R when you know how many chars (9) to change
    • to delete 5 characters … there is NO simpler keystroke sequence
  • R — Overwrite each character one by one until end of line. Useful if the replacement content is similar to original?
  • Ctrl-R to re-do
  • cw — wipe out from cursor to end of word and puts you into insert mode
    • c2w or 2cw
  • :se list (or nolist) to reveal invisible chars
  • C — wipe out from cursor to END of line and puts you into insert-mode
  • capital O — open new line above cursor
  • A — to append at END of current line
  • from inside q(LESS), type a single “v” to launch vi

–paging commands in vi and less

  • jump to end of file: capital G == in both vi and LESS
  • jump to head of file: 1G == in both vi and LESS
  • page dn: Ctrl-f == in both; LESS also uses space
  • page up: Ctrl-b == in both; LESS also uses b

— q[less] searching feature

  • after you have searched for “needle1”, how do you expand on the pattern? You can hit 2 keys
    • [2]  /↑ (i.e. <upArrow>) to load “needle1”. Now you can edit it or add an alternative like
    • [2+]/↑ (i.e. <upArrow>) |needle2|needle3

[3/4] means vi receives 3 keystrokes; we hit 4 keys including shift or ctrl …

vi on multiple files

[3/4] means vi receives 3 keystrokes; we hit 4 keys including shift or ctrl …

–“split” solution by Deepak M

vi file1 # load 1st file

  • :sp file2 # to show 2nd file upstairs
  • :vsp file3 # to show 3rd file side by side
  • You end up with — file2 and file3 side by side upstairs, and file1 downstairs!
  • [2/3] ctrl-ww # To move cursor to the “next” file, until it cycles back

–the q( :e ) solution

vi file1 # load 1st file

  • :e file2 # to put 2nd file on foreground
  • [1/3] ctrl-^ — to switch to “the other file”
  • This solution is non-ideal for moving data between files, since you must save active file before switching and you can’t see both files

–editing 3 or more files

  1. vi file1 file2 file3
  2. q(:n) to switch to next, q(:N) for previous…
  3. q(:args) shows all files
  • –Suppose now you are at file2.
  • q(:e file4) works. q(^) will toggle between file2 and file4
  • However, q(:n :N  :args) only work on the original list, not the new files from q(:e)

q(:n :N ^) always shows the current filename in status bar:)

unix family tree #MacOS

This is academic knowledge for the self-respecting techie.

https://upload.wikimedia.org/wikipedia/commons/c/cd/Unix_timeline.en.svg  and https://en.wikipedia.org/wiki/UNIX_System_V#/media/File:Unix_history-simple.svg show

  • MacOS is based on BSD
  • iOS and MacOS are based on Darwin
    • Darwin is based on BSD
  • linux contains no BSD or Unix codebase
  • most commercial Unix versions are based on sysV

ssh host1 ssh host2 q[cmd1; cmd2]

This simple automation script demonstrates how to ssh 2 layers into a machine to run a command.

Obviously you need to set up authorized_keys.

#!/bin/bash
date=0710
tgz=nx$date.tgz
set -x
ssh -q uidev1 ssh -q bxbrdr2 "tar cfvz $tgz /data/mnt/captures/tp5/lfeeds/nysemkt/nysemkt-primary.2017${date}_03*"
ssh -q uidev1 ssh -q bxbrdr2 "hostname; ls -l ~/$tgz"
ssh -q uidev1 scp -pq bxbrdr2:$tgz .
set +x
ssh -q uidev1 "hostname; ls -l ~/$tgz"
scp -pq uidev1:$tgz .
ls -l $tgz

bash script: demo`trap,read,waiting for gdb-attach

#1 valuable feature is the wait for gdb to attach, before unleashing the data producer.

#2 signal trap. I don’t have to remember to kill off background processes.

# Better source this script. One known benefit -- q(jobs) command would now work

sigtrap(){
  echo Interrupted
  kill %1 %2 %3 %4 # q(jobs) can show the process %1 %2 etc
  set -x
  trap - INT
  trap # show active signal traps
  sleep 1
  jobs
  set +x
}

set +x
ps4 # my alias to show relevant processes
echo -e "\njobs:"
jobs
echo -en "\nContinue? [y/any_other_key] "
unset REPLY; read REPLY
[ "$REPLY" = "y" ] || return

trap "sigtrap" INT # not sure what would happen to the current cmd and to the shell

base=/home/vtan//repo/tp/plugins/xtap/nysemkt_integrated_parser/
pushd $base
make NO_COMPILE=1 || return
echo '---------------------'
popd
/bin/rm --verbose $base/*_vtan.*log /home/vtan/nx_parser/working/CSMIParser_StaticDataMap.dat

set -x

#If our parser is a client to rebus server, then run s2o as a fake rebus server:
s2o 40490|tee $base/2rebus_vtan.bin.log | decript2 ctf -c $base/etc/cdd.cfg > $base/2rebus_vtan.txt.log 2>&1 &

#if our parser is a server outputing rtsd to VAP, then run c2o as a fake client:
c2o localhost 40492|tee $base/rtsd_vtan.bin.log | decript2 ctf -c $base/etc/cdd.cfg > $base/rtsd_vtan.txt.log 2>&1 &

# run a local xtap process:
$base/shared/tp_xtap/bin/xtap -c $base/etc/test_replay.cfg > $base/xtap_vtan.txt.log 2>&1 &
#sleep 3; echo -en "\n\n\nDebugger ready? Start pbflow? [y/any_other_key] "
#unset REPLY; read $REPLY; [ "$REPLY" = "y" ] || return

# playback some historical data, on a multicast port:
pbflow -r999 ~/captured/ax/arcabookxdp1-primary 224.0.0.7:40491 &

set +x
jobs
trap

q[return]: bash func^sourced^regular script

http://stackoverflow.com/questions/9640660/any-way-to-exit-bash-script-but-not-quitting-the-terminal says

you can use return instead of exit. Its main purpose is to return from a shell function, but if you use it within a sourced script, it returns from that script.

In a regular (standalone) shell script, “return” also kind-of works. The statement itself is illegal, so it fails, but all the commands before it execute as expected.

                         return            exit
from function            🙂                probably dangerous
from sourced script      🙂                immediate disconnection 😦
from standalone script   fails but OK 🙂   nice
from shell               fails but OK      immediate disconnection 😦

q[grep] -F -o

-F, --fixed-strings

Interpret PATTERN as a list of fixed strings, separated by new-lines, any of which is to be matched. See http://stackoverflow.com/questions/3242873/grep-for-literal-strings for examples. Useful if your needle contains meta-characters you want to treat as literals

-x, --line-regexp

Select only those matches that exactly match the whole line.

-o

output only the matched substring.
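
A small illustration of -F and -o together — with -F the dot in the needle is a literal:

$ echo 'aXb a.b' | grep -o 'a.b'
aXb
a.b
$ echo 'aXb a.b' | grep -Fo 'a.b'
a.b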

 

unix signal Lesson 1 #Trex IV

Q: why did you say the signals sent are pending and there’s a delay in response?
%%A: if the target process is executing some special instructions the kernel won’t deliver the signal.
AA: For example waiting for network/disk IO — such a process may be almost unkillable. https://major.io/2010/03/18/sigterm-vs-sigkill/ says a reboot is required.

%%A: It’s possible that when target process gets a time slice it then gets a chance to check the “signal table”.
AA: A Unix signal is picked up only at start of a time slice by the recipient “cpu-driver”. See P126[[art of debugging]].

So I gave two correct answers to Olgun of Trexquant but he didn’t acknowledge. I also described the (common) scenario of a one-core machine — the signaling process needs to give up the CPU before the receiving process can react to the signal.

A postman drops a letter on your doorstep when you are out — this is the best analogy for how the signal is delivered. It’s uncommon (impossible?) to interrupt a process /midstream/ a sequence of instructions. I guess the reason is efficiency — there is no efficient implementation of such a “real-time” interrupt mechanism. In Unix, what is common is the signal mechanism.

Pending signals for a parent process are not inherited by a forked child process.

Unix signals can be generated from interactive user, from any other process, from kernel or from hardware, but they all have a target PID.

Unix Signal is at a lower level than threading. Thread preemption often depends on Unix signals

Unix Signal is a kernel-level mechanism.

Unix Signals target a process, not a thread.

A Unix Signal is an event. A Signal handler is an event callback function whose address is associated with one specific signal.

unix dirname() pitfalls

https://www.systutorials.com/qa/681/get-the-directory-path-and-file-name-from-absolute-path-linux pointed out a few pitfalls:

  1. dirname() modifies the argument, so we need to duplicate the string in advance.
  2. If we omit the strdup, then … seg fault.
#include <iostream>
#include <stdlib.h> //free()
#include <string.h> //strdup()
#include <libgen.h> //dirname()
using namespace std;

int main(){
  const char * path="/var/opt/aquis/working/security-20180108.csv.20180108";
  char * clone = strdup(path); //needed -- dirname() may modify its argument
  cout<<dirname(clone)<<endl;
  free(clone);
}

search in q[less]

  • In a q(less) screen, you can highlight not only the same AAA occurrences, but AAA or BBB. I used

/Reset|Msg_|Warn_|Primary|Secondary

  • [2/3] /ctrl-R — search without metacharacters
  • -i, then /needle — case insensitive

[3/4] means vi/less receives 3 keystrokes; we hit 4 keys including shift or ctrl …

–search can fail with super-long lines. See also http://www.greenwoodsoftware.com/less/bugs.html

  • Symptom — the search misses the target word; yet if you first navigate to the correct region of the big text file, the same search does hit it.
  • Solution — use grep to confirm

 

perl q[..] range operator

there are 2 largely unrelated operators:

  • list range-operator,
  • scalar range-operator

#1 most common usage (list context) iterates over a range of integers or strings, similar to python xrange(). [1] has clear examples.

#2 (possibly more powerful) usage is during text file line-by-line processing. The operator selects a continuous chunk of lines.  Un-intuitively, this is considered scalar context. [1] has useful example, but need to understand the basics first.

#2b the simplest use case has two hardcoded line numbers.

[1] http://perldoc.perl.org/perlop.html#Range-Operators Overall, this article has too many technicalities, which obscure the key, useful features.
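
Minimal sketches of both contexts (file.txt is a placeholder):

$ perl -e 'print join(",", 1..5), "|", join(",", "aa".."ad")'    # list context
1,2,3,4,5|aa,ab,ac,ad
$ perl -ne 'print if /BEGIN/../END/' file.txt    # scalar (flip-flop) context — prints each chunk of lines from a /BEGIN/ line through the next /END/ line
$ perl -ne 'print if 3..5' file.txt              # 2b: bare numbers are compared against $. i.e. the line number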

 

linux command elapsed time

–q(time) command

output can be a bit confusing.

Q: does it work with redirection?

–$SECONDS variable
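
A quick sketch of both (some_long_cmd is a placeholder). Note the bash q(time) keyword writes its report to stderr, so redirecting the command’s stdout doesn’t hide the timing:

$ time some_long_cmd > /dev/null
$ SECONDS=0; some_long_cmd; echo "elapsed: $SECONDS sec"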

bookmarking in vi

— based on http://www.yolinux.com/TUTORIALS/LinuxTutorialAdvanced_vi.html

Any line can be “Book Marked” for a quick cursor return. Type the letter “m” and any other letter to identify the line. This “marked” line can be referenced by the keystroke sequence “‘” and the identifying letter.

Example: “mt” will mark a line by the identifier “t”. “‘t” will return the cursor to this line at any time. I prefer mm and ‘m

A block of text may be referred to by its marked lines, i.e. ’t,’b

trailing comment in DOS batch

http://stackoverflow.com/questions/12407800/which-comment-style-should-i-use-in-batch-files

set "var=3"     %= This is a comment in the same line=%
dir junk >nul 2>&1 && %= If found =% echo found || %= else =% echo not found
The leading = is not necessary, but I like it for the symmetry.
There are two restrictions:
1) the comment cannot contain %
2) the comment cannot contain :

%%[15] bashrc

append.bashrc.txt

<< ______________end_of_comment_1________________
J4 etc: to reduce section spacing
modified [11 Jan 2007]
______________end_of_comment_1________________
[ -z "$(echo $- | grep i)" ] && return
# ==================== <> should be first ====================
export PS1='\n\s!\! \u@\h [ \t \d ] \w/ \n\$ '
export EDITOR=vi # primarily for sqlplus
export EXINIT=" :se nu | :map v :w " # must
export LESS="--ignore-case"
export TMOUT=987654; [ $root ] && export TMOUT=100600
HISTSIZE=900
set -o emacs
set +o nounset
export PATH=/usr/xpg4/bin:$PATH; [ $(id -u) = 0 ] && root=yesur
set +o noclobber; [ $root ] && set -o noclobber
sav=$LOGNAME
#
# PAGER and TERM issues:
#
# "vt100 less -E" is ok on linux
# TERM 'linux' breaks vi in gnome-terminal; 'linux' imperfect for solaris
# for "less", -E needed for dir-viewing, file-viewing but affects man
#
export PAGER=more
export PAGERE=more
if [ -z "$(which less |grep '^no less')" ]; then
  export PAGER="less"      # breaking some x terminals but ok in "console"
  export PAGERE="less -E"  # -E needed for dir-viewing, file-viewing but affects man
fi
echo TERM=$TERM initially
export TERM=vt100 # 'linux' breaks vi in gnome-terminal; 'linux' imperfect for solaris
# ==================== <> ====================
alias killthem="perl -pe 's/^\S+(\s+\d+).*/\1/ or s/.*//'|xargs -pt kill" # p str1 | killthem
# alias killthem="perl -pe 's/\S+(\s+\S+).*/\1/s'|xargs -pt kill" # p str1 | killthem
# sortable :
alias %=cd_l # typing trainer
alias ..='cd_l ..'
alias cp='cp -i' # defuse
alias egi='export |grep -i'
alias h=history
alias hgi="history |grep -i"
alias j1="fg %1"
alias j2="fg %2"
alias m=$PAGERE
alias mv='mv -i' # defuse
alias p=ps_grep
alias path=" echo \$PATH |perl -pe 's/:/\n/g' |sort -u|m "
alias rm=myrm
alias s=' source $HOME/.bashrc ' # .profile absent
alias t='l -t'
alias top='prstat'
# ==================== <> difficult to sort ====================
cd_l(){
  [ $# -eq 0 ] && l && return
  [ -n "$( file $* | grep directory )" ] && cd $* && l && return
  [ -n "$( file $* | perl -ne 'print if /text|script/' )" ] && m $* && /bin/echo "\n\n" && l $*
}
d(){
  [ $# -ne 0 ] && cd $*
  [ `pwd` = '/' ] && target=/ && echo I am in /
  echo "In MB :"
  eval du -ks * |sort -n|perl -pe 's|^(\d+)|$1/1000|e'
}
g(){ # bug with g -v
  pattern=$1; shift
  cmd=" grep -i \"$pattern\" $* "
  /usr/bin/printf '%s\n' "cmd=__$cmd __"
  eval $cmd |$PAGERE
}
l(){ # can't be completely replaced by 'cd_l' or ']', because "cd_l -tr dir1" is confusing and should be avoided
  /bin/ls -alFs $* |$PAGERE
}
myrm(){
  cmd="mv -i $* /var/tmp/$sav/ "
  /usr/bin/printf '%s\n' "cmd=__$cmd __"
  eval $cmd
}
ps_grep(){
  ## cmd1='/bin/ps -ef' # truncation risk
  # ps auxwww : inconsistent for root vs non-root
  cmd1='/usr/ucb/ps auxwww' # |grep -v grep -- no cos some real commands contain grep
  for f in $*; do cmd1="$cmd1 | g $f"; done
  eval $cmd1
}
sav(){
  suffix=$RANDOM$(date +"%H%d%b") # to avoid misleading readers, make the suffix ambiguous
  for f in $*; do
    f=$(echo $f|perl -pe 's|/$||') # sav dir/
    b=`basename $f`
    ( cd `dirname $f`
      tar cf - $b | (cd /tmp; tar xpf -)
      /bin/mv -i /tmp/$b $b.b4${sav}edit$suffix
      [ -d $f ] && opt=' -d'
      eval ls -lt $opt $b*
    )
  done
}
# ==================== <>, an exercise in grouping ====================
exa(){ #testing: exa -t ss ~/.ssh
  local ops
  while is_opt=`echo $1 |perl -ne 'print if /^-/' ` && [ "$is_opt" ]; do
    ops="$ops $1"; shift
  done
  if [ $# -eq 1 ]; then
    fullpath=$1
    shortname=`echo $1|perl -pe 's|/$||;s|^.*/||' `
  else
    fullpath=$2
    shortname=$1
  fi
  [ -x $fullpath ] || return
  # export "$shortname"="$fullpath"
  set $shortname="$fullpath"
  prj="$fullpath"
  alias "$shortname"="cd $fullpath;l $ops"
  alias prj=$shortname # USE MORE
}
exa /tmp/
exa /etc
# ==================== <> ====================
add_path(){
  [ -r $1 ] || return
  [ "$(echo $PATH|grep :$1)" ] && return
  # check world write perm, esp for root
  PATH=$PATH:$1
}
set_manpath(){ # very slow on some systems. Run this when u need it.
  for d in `ls -d /[uo]*/*/man /[uo]*/*/*/man /[uo]*/*/*/*/man`; do
    export MANPATH=$MANPATH:$d
  done
}
set_path(){
  add_path /usr/sbin
  add_path /usr/bin
  add_path /sbin
  add_path /usr/local/bin
  add_path /usr/cluster/bin
  add_path /usr/openwin/bin/
  add_path /usr/ucb
  add_path $ORACLE_HOME/bin
  add_path /usr/sfw/bin # pre-installed
  add_path /opt/csw/bin
  [ $root ] && return
  add_path .
  add_path $HOME/bin
}
set_path

vbscript can …

For localized/sandbox tasks like file processing or DB, xml…, perl and python are nice, but I feel vbscript is the dominant and standard choice for system automation. Vbscript integrates better into Windows. In contrast, on Linux/Unix, python and perl aren't stigmatized as 2nd-class-citizens

— the following are based on [[automating windows administration]] —

access registry

connect to exchange/outlook to send/receive mails

regex

user account operations

**query group membership

**query Active Directory

**CRUD

file operations

** size

** delete folder

** read the version of a (DLL, EXE) file

** recursively find all files meeting a (size, mtime, atime..) criteria

** write into a text file

/proc/{pid}/ useful content

Based on [[John Fusco]] —

./cmdline is a text file …
./cwd is a symlink to the current working dir of the process
./environ is a text file showing the process’s env vars
./fd/ holds symlinks to the file descriptors, including sockets
./maps is a text file showing user space memory of the process
./smaps is a text file showing detailed info on shared libs used by the process
./status is a more human-readable text file with many process details
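
A quick way to poke at these from the shell ($$ is the shell’s own pid):

$ ls -l /proc/$$/cwd                      # the cwd symlink
$ tr '\0' '\n' < /proc/$$/environ | head  # entries are NUL-separated
$ head -3 /proc/$$/status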

## 10 unix signal scenarios

A signal can originate from outside the process or from within.

The precise meaning of signal generation requires a clear understanding of signal handlers. See P125 [[art of debugging]] and P280 [[linux sys programming]]

— External —
# (SIGKILL/SIGTERM) q(kill) commands
# (SIGINT) ctrl-C
# (SIGHUP) we can send this signal to Apache, to trigger a configuration reload.

— internal, i.e. some kernel code module “sends” this signal to the process committing the “crime” —
# (SIGFPE) divide by zero; arithmetic overflow,
# (SIGSEGV) memory access violation
# (SIGABRT) assertion failure can cause this signal to be generated
# (SIGTRAP) target process hitting a breakpoint. Except debuggers, every process ignores this signal.

3 types of system services ] linux

Hi Pravin,

You asked about services written in c++. I googled and found that “service” has a specific meaning in linux/unix. http://www.comptechdoc.org/os/linux/howlinuxworks/linux_hlservices.html describes 3 types of services —

  • A one time only program run at boot-up to provide a function to the system such as kudzu, or keytable.
  • A program run as a local daemon upon startup that provides system services such as gpm, autofs, cron, and atd.
  • A program run as a network-daemon upon startup that provides networking services such as dhcpd, bootparamd, arpwatch, gated, and httpd.

The way I see it (no expert here), the standard *nix network daemons provide a Generic functionality at the System level – like DHCP, DNS, http, FTP, mail etc.

In contrast, a custom c++ engine used in finance has site-specific business logic. It often binds to a standard network port. Alternatively it may use database, RPC, (Corba?), MOM, or some proprietary protocol for network communication. In some cases the communication is entirely inter-process — no networking. Just IPC like shared memory, named pipes or unix domain sockets. I don’t know which communication mode is dominant. I guess shared memory and MOM are popular.

For sys admin, I prefer command line to windowing GUI

I prefer GUI text editors (including IDE) to command line editors. I prefer GUI browsers to command line browsers. Most office applications (word processing, spreadsheet etc) must use a windowing GUI. However, for administration, like many power users I prefer the window-less command/text interface (CTI) to the Microsoft suite of control-panel screens. Note Microsoft also offers many admin tools as CTI.

– CTI output is easier to search. GUI can present hierarchical output better but requires you to click many times and you need to know where to drill down. For complex CTI tools you may need to remember a bunch of switches. Switches are easy to document. I find them easier to remember. It’s a long tradition among CTI tools to offer a combination of switches that shows just about everything in pages of text and expect users to grep. Personally I find it rather effective dealing with large-volume data output.
– CTI much easier to script and automate
– CTI easier to integrate with programming languages
– CTI generally easier to remote-execute. For windows, you need to go through a few layers of windows.
– CTI trades mouse for keyboard. Many (probably majority) Power users prefer keyboard.
– CTI often helps us see errors. In MS-win, the error can be hidden somewhere in the GUI or in the event log. To be fair, most administration tools (CTI or GUI) have difficulty communicating errors effectively.
– CTI (result) easier to compare, across versions or across machines. You just save the output in a text file and diff.
– CTI easier to redirect input/output to files. Some GUI have that support but often fail to capture Complete output, which is easy and natural in CTI. Every unix CTI lets you merge stderr and stdout to a single file.
– CTI easier to collaborate with tech support or remote colleagues
– CTI has a cleaner configuration (a kind of input). GUI uses nested panels — often hidden from view. In contrast, CTI uses either config file or nested command line switches — easily documented.

comment in DOS batch files

REM (safer than labels) may be used mid-stream within a command line. 

In contrast, Single colon or double colon Labels should always start at the first non-whitespace character in a command line.

REM Comment line 1
REM Comment line 2
:Label1
:Label2
:: Comment line 3
:: Comment line 4
IF EXIST C:\AUTOEXEC.BAT REM AUTOEXEC.BAT exists

are all allowed. However, this is bad --
IF EXIST C:\AUTOEXEC.BAT :: AUTOEXEC.BAT exists
See also http://www.robvanderwoude.com/comments.php

%%top 3 tips on unix permissions

Scripts must be readable AND executable [1] but compiled programs need only be executable.

[1] exception — It is possible to run a script without execute permission by entering sh myscript

You don’t have to be the owner of a file or have write permission on it to rename or delete it!  You only need write permission on the directory that contains the file.

A directory isn’t really a program that you can run even if it has execute permission. The execute bit is *reused* (like a C++ union) rather than wasting space on additional permission bits.

Besides controlling a user’s ability to cd into some directory, the execute permission is required on a directory to use the stat() system call on files within that directory. This stat() returns file inode details. Therefore, to use ls -l file (i.e., to use stat() system call), you must have execute on the directory, the directory’s parent, and all ancestor directories up to and including “/” (the root directory). If execute permission is required for a directory, it is usually required for each enclosing directory component on the full path to that directory.

———- The tips below are less understood —
The execute bit on a directory is sometimes called search permission.  For example, to read a file /foo/bar, before the file can be accessed you must first search the directory foo for the inode of file bar.  This requires search (“x”) permission on the directory /foo.  (Note you don’t[2] need read permission on the directory to search in this case!  You would need read permission on a directory if you were to list its contents.)

[2] With execute but not read permission on a directory, users cannot list the contents of the directory but can access files within it if they know about them.

&& and || together

[ -f unixfile ] && rm unixfile || print “unixfile was not found, or is not a regular file” — to be tested

Finally, multiple commands can be executed based on the result of command1 by incorporating braces and semi-colons:

command1 && { command2 ; command3 ; command4 ; }

If the exit status of command1 is true (zero), commands 2, 3, and 4 will be performed.

See other post(s) on exit status

perl one-liner : sum across lines

A common perl -n / perl -p challenge is to accumulate a sum across all input lines. The solution is the END{} block.

This example below is from http://www.theperlreview.com/Articles/v0i1/one-liners.pdf

ls -lAF | perl -ne '   next if /^d/; $sum += (split)[4]; END{ print $sum }   '

(I guess this sums up all the files’… sizes?)

Note you need not pre-declare $sum as a static-local (in C lingo). I guess it’s static-local by default.

was my unix cmd successful@@

This post focuses on one question:

Q: was previous cmd successful@@
A: (based on [[ teach yourself shell programming ]])

if [ $? -eq 0 ] ….

Background: i wasn’t sure if my own solutions were reliable. Now this author also believed in [ $? -eq 0 ]

cmd1 && … # tests 0. [[ learning the bash shell ]]
cmd1 || .. # tests 0.

“0 means OK” is standard convention. A script or command can break the convention, but you are allowed to assume no one does.

dereferencing a perl reference #my take

This is based on perlreftut’s excellent “Use Rule 1”.

1) To get a reference on a scalar/array/hash, put a “\” in front of the $/@/%
2) To dereference a reference, surround it with
   ${.....} — eg: ${$a}
   @{.....} — eg: @{$b}
   %{.....} — eg: %{$h}

==================The longer version=================
1) To get a reference on a scalar/array/hash or even a filehandle or subroutine, put a “\” in front of the complete variable — complete with those funny characters
\$a
\@b
\%c

Prefixing is cleaner and simpler than dereferencing.

2) Think of a listRef $ref holding a string “items”. @{$ref} becomes @{items}. This is actually a real scenario, known as symbolic reference.
In other words, You can always use an array reference, in curly braces, in place of the name (without titles like “@”) of an array. The titles like “@” were left there by your predecessor before you arrived in your curly braces.
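
A minimal demo of Use Rule 1:

$ perl -e '@b = (5,6,7); $ref = \@b; print "@{$ref} | ${$ref}[0] | $#{$ref}\n"'
5 6 7 | 5 | 2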

perl bless, briefly

2nd argument to bless is a …. well, a classname and also a package name. Whatever string you put there must be a valid package name (or the current package’s name if omitted). That package name is interpreted as a classname. The new object becomes an instance of that class. “The new object is blessed INTO the class”, as they like to say.

This 2nd argument is fundamental to constructor-inheritance. See P318 [[ programming perl ]]

The referent is often an empty hash. In other words, the reference to bless often points to an empty hash.

In Perl lingo, you can bless a referent or bless a reference, and everyone knows what you mean — no confusion.

Q: why do we need bless when a referent is already a chunk of memory
%%A: a bare reference can’t invoke a method. No inheritance of methods
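
A minimal constructor sketch (class and field names made up) showing the role of the 2nd argument:

package Account;
sub new {
    my ($class, %args) = @_;     # $class is "Account" -- or a subclass name
    my $self = {%args};          # the referent: often an empty or near-empty hash
    return bless $self, $class;  # blessed INTO the class named by the 2nd arg
}
package Savings; our @ISA = ('Account');  # inherits new()

package main;
my $obj = Savings->new(balance => 100);
print ref($obj), "\n";           # prints "Savings" -- constructor inheritance at work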

pushd popd – top 5 tips

  • pushd dirA ### 1) push current dir, 2) push dirA, 3) Enter (i.e. cd into) dirA, i.e. the new top
  • popd ### without arg — means 1) pop 2) Enter the new top in the stack. I guess the item “removed” is forgotten completely.
  • pushd ### without arg — means 1) swap top two on the stack. 2) “Enter” (i.e. cd into) new top
 –Gory details
### I feel top item is always, always kept in sync with current dir. An understanding of the stack is essential. The stack is shared by cd, pushd, popd.
  • cd – ### a single dash — means “Enter” previous directory (in history, not the stack). Overwrite top.
  • pushd +2 ### means 1) identify the 3rd (not 2nd) on the stack. Rotate it to top. 2) Enter new top
  • cd dir1 # replace top item on the stack with dir1
  • dirs -v # print stack. Readonly operation.
  • popd +2 # identify the 3rd (not 2nd) on the stack and remove it. Don’t change current directory, since top is unaffected.
  • popd +0 # Probably pop the top. Enter the new top.

q{FIND} scan — perl,grep,notepad++

# use grep inside perl, without xargs
find .|perl -nle 'print "$_ --\n$a" if /\.(C|h)/ and $a=qx(grep -i "btmodels" "$_") '
find .|perl -nle 'print "$_ --\n$a" if !/\.git/ and $a=qx(grep -i "btmodels" "$_") '

—-windows
MSWE search is unreliable for full-text search. Ditto for MSVS search. Don’t waste time on them!
Try notepad++. You can click the search result, and get keyword highlight, like in google!
Try findstr in http://stackoverflow.com/questions/698038/windows-recursive-grep-command-line

lex^pack-var ] perl – another phrasebook

This is yet another of my attempts to extract the gist of …. a tricky area. (Page numbers refer to the camel book)

Keyword: symbol table — package vars exist on a symbol table, while lexicals don’t. As a result, package vars are accessible from anywhere by the fully qualified name, just like a file’s full path; whereas a lexical is inaccessible from outside its “home” ie lexical scope.

Keyword: fullpath

Keyword: global var — are always package vars. Contrast — A file-level lex (ie declared outside of subs) is accessible in the subs of the file, but isn’t as “global” as a package var.

Keyword: stack vars — Function-local lexicals are similar to java’s stack vars. Usually [1] they lose their value when the sub returns.

[1] P223 — persistence across calls

Keyword: auto-var — I feel lexical is similar to C auto variables.

Keyword: nested sub — if a small sub is (rare!) defined completely within a bigger sub, then lexicals in the outer are visible throughout. If the called sub is defined outside the caller => invisible. Exactly like c++/java. P743.

P56 details Perl’s variable name lookup “algorithm” stroke by stroke.

When learning the (non-trvial) essentials of lex^pack-var, ignore use-strict, local or “our” for the time being.

win32 batch – nested script choices

Know the differences among these, if you have time

— directly calling another script

— CALL nestedScript.bat

— START nestedScript.bat
give birth to another Command Prompt window, then die. A bit like unix exec.

— cmd /k nestedScript.bat
run the script, then keep the new interpreter open

— cmd /c nestedScript.bat
run the script, then terminate the new interpreter

— cmd nestedScript.bat
instantiates a new command interpreter. This seems to be the most versatile and popular

——
Just a small story – I invoked mvn.bat directly and also tried cmd. In the end, i had to use CALL

see http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/ntcmds.mspx?mfr=true

static mailer using /bin/mailx

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.util.Arrays;

public class StaticMailer {
    private static final String tmp1 = System.getProperty("mailx");
    private static final String MAILX = (tmp1 == null ? "/bin/mailx" : tmp1);

    static public char send(String to, String subject, String body) {
        String[] command = new String[] { MAILX, "-s " + subject, to };
        return privateSend(command, body);
    }

    static private char privateSend(String[] command, String body) {
        System.out.println(Arrays.asList(command));
        try {
            Process pr = Runtime.getRuntime().exec(command);
            OutputStream stdin = pr.getOutputStream();
            stdin.write((body + "\n").getBytes()); // the "\n" had lost its backslash in formatting
            stdin.flush();
            stdin.close();
            BufferedReader input = new BufferedReader(new InputStreamReader(pr.getInputStream()));
            String line;
            while ((line = input.readLine()) != null) {
                System.out.println(line);
            }
            pr.waitFor();
        } catch (Exception e) {
            e.printStackTrace();
            return 'e';
        }
        return '0';
    }
}

linux named pipe – simple experiment – inode, mtime…

Copied from http://linuxprograms.wordpress.com/2008/02/14/fifo-named-pipes-mkfifo-mknod/

Reading/ Writing data from/to a FIFO
Let’s open two terminals
In the first terminal

$ mkfifo fifo    # create the named pipe first (the source article covers mkfifo/mknod)
$ cat > fifo
we are experimenting with the FIFO
This is second line.

After opening the fifo in the second terminal for reading using cat, you will notice the above two lines displayed there.
Now open the second terminal and go to the directory containing the FIFO ‘fifo’

$ cat fifo

we are experimenting with the FIFO
This is second line.

Now keep on writing to the first terminal. You will notice that every time you press enter, the corresponding line appears in the second terminal.

Pressing CTRL+D in the first terminal terminates writing to the fifo. This also terminates the second process because reading from the fifo now generates a “BROKEN PIPE” signal. The default action for this is to terminate the process.

Let us now see the details of the file ‘fifo’

$ ls -l fifo
prw-r--r-- 1 user user 0 Feb 14 10:05 fifo

The p in the beginning denotes that it is a pipe.

Let’s see more details about the pipe using stat

$ stat fifo
  File: `fifo'
  Size: 0             Blocks: 0        IO Block: 4096   fifo
Device: fd00h/64768d  Inode: 1145493   Links: 1
Access: (0644/prw-r--r--)  Uid: ( 0/ user)   Gid: ( 0/ user)
Access: 2008-02-14 10:05:49.000000000 +0530
Modify: 2008-02-14 10:05:49.000000000 +0530
Change: 2008-02-14 10:05:49.000000000 +0530

If you notice carefully, FIFOs just like a normal file possess all the details like inode number, the number of links to it, the access, modification times, size and the access permissions.

As in the case of pipes, there can be multiple readers and writers to a pipe. Try opening multiple terminals to read from and write to a pipe.

3 differences – perl vs c++

Someone asked me this question.

My answer: strongly typed vs dynamically typed. strings and numbers.. You can have an array of mixed data types. Not practical in c++ — a subversion of strict type control.

My answer: not compiled but interpreted — run-time performance

My answer: no pointers in perl, no direct access to hardware

–other key differences I should have included
– c++ can leverage lots of legacy C code, and call C system functions directly
– c++ is  memory efficient.
– c++ object code is platform-specific
– class-support, vtbl,
– OO supports larger projects. Somehow OO perl isn’t catching on

file descriptor redirection, exec

–Annotations on http://tldp.org/LDP/abs/html/ioredirintro.html

bash$ lsof -a -p $$ -d0,1,2
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
bash 363 bozo 0u CHR 136,1 3 /dev/pts/1
bash 363 bozo 1u CHR 136,1 3 /dev/pts/1
bash 363 bozo 2u CHR 136,1 3 /dev/pts/1

bash$ exec 2> /dev/null
bash$ lsof -a -p $$ -d0,1,2
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
bash 371 bozo 0u CHR 136,1 3 /dev/pts/1
bash 371 bozo 1u CHR 136,1 3 /dev/pts/1
bash 371 bozo 2w CHR 1,3 120 /dev/null <—

http://tldp.org/LDP/abs/html/x17601.html#USINGEXECREF shows —

exec 6>&1 # Link file descriptor #6 with stdout.
# I think this creates a new file descriptor FD#6 as alias of FD#1. FD#6 is probably a **pointer** to the in-memory object FD#1. The object IS the original file descriptor.

exec > $LOGFILE 2>&1 # stdout replaced with file “logfile.txt”.
# the object is not discarded. FD#6 still points to it, but the current process no longer uses that object.

#### this is a useful thing to put into your script, if someone calls your script.

# now the current process will use the original “object” from now on.
exec 1>&6 6>&- # Restore stdout and close file descriptor #6.

Session States in netstat output

State        Description
LISTEN       accepting connections
ESTABLISHED  connection up and passing data
SYN_SENT     TCP; session has been requested by us; waiting for reply from remote endpoint
SYN_RECV     TCP; session has been requested by a remote endpoint for a socket on which we were listening
LAST_ACK     TCP; our socket is closed; remote endpoint has also shut down; we are waiting for a final acknowledgement
CLOSE_WAIT   TCP; remote endpoint has shut down; the kernel is waiting for the application to close the socket
TIME_WAIT    TCP; socket is waiting after closing for any packets left on the network
CLOSED       socket is not being used (FIXME: what does this mean?)
CLOSING      TCP; our socket is shut down; remote endpoint is shut down; not all data has been sent

Perl global scope and lexical scope

Nikhil,

P58 of the camel book 3rd edition (excellent summary of scoping rules) says “Although at least two different scopes (lexical and package) are active everywhere in your program, a variable can only exist in one of those scopes”.

So the 2 basic types of variable scopes are lexical and package. Package scope is also known as “global scope”.

Hope this helps, otherwise, feel free to ask me.

dtrace/truss, ptrace /proc/sys basics

On freebsd, truss works by stopping and restarting the process being monitored via ptrace()
On Solaris, truss works by stopping and restarting the process being monitored via /proc. Dtrace doesn’t stop/start a process, therefore adds lower overhead.
    
/proc is readable by cat not less.
/proc is mostly readonly, but on linux /proc/sys is writable !

http://www.linuxjournal.com/article/2365?page=0,0 is good intro. Wikipedia says

By using ptrace (the name is an abbreviation of “process trace”) one process can control another, enabling the controller to manipulate the internal state of its target. ptrace is used by debuggers such as gdb and dbx.

By attaching to another process using the ptrace call, a tool can single-step through the target’s code. The ability to write into the target’s memory allows not only its data store to be changed, but also the applications own code segment, allowing the controller to install breakpoints and patch the running code of the target.

ptrace is available as a system call on AIX, FreeBSD, Mac OS X, Linux, and HPUX up to 11. On Solaris, ptrace is implemented as a library call, built on top of Solaris kernel’s procfs filesystem; Sun notes that ptrace on Solaris is intended for compatibility, and recommends that new implementations use the richer procfs.

perl subroutine – pass by ref

J4: pbref is common in some perl code bases

–based on http://www.tek-tips.com/faqs.cfm?fid=427
an example that describes how you can pass references to a subroutine, and use those references in the subroutine to change the values of the variables that those references point to

–based on http://www.troubleshooters.com/codecorn/littperl/perlsub.htm
You can modify the original variable without creating references. (compare c# ref-parameters)
This 7-page tutorial shows how to pass scalar, list or hash as input-only or input/output args to a sub.
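
A short sketch of both styles (sub names made up):

sub inc_via_ref   { my ($aref) = @_; $_++ for @$aref }  # caller passes \@arr
sub inc_via_alias { $_++ for @_ }   # @_ elements alias the caller's variables

my @arr = (1, 2, 3);
inc_via_ref(\@arr);    # @arr is now (2, 3, 4)
inc_via_alias(@arr);   # @arr is now (3, 4, 5)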

shadowing a package var (perl

it’s confusing to use a lexical $v in a sub1(), and in the same sub1() use another $v.

What kind of var is the other $v? I think it has to be a package var (both $v active simultaneously in 2 worlds), NOT another lexical. I think if it’s another lexical, then u can’t access it in sub1(). It’s shadowed and can’t be unshadowed.

If (in a legacy system) it’s a package var, then u can say $main::v. I don’t see any other way. I believe use vars qw($v) won’t work.

bash – test a command is available

Based on [[ teach yourself shell programming ]]
Background — i was looking for a 1)reliable and 2) efficient solution.

type yourCmd > /dev/null 2>&1
echo $? #### should always echo a zero (ok) or non-zero (nok)

type cd > /dev/null 2>&1 && echo ok
type cdddd > /dev/null 2>&1 || echo nok

These were tested personally but some readers (see comment below) reported they don’t always work.

when (not) to use "my" in perl

A practical issue — everyday decision
An important issue — affecting maintenance, error-prone (?)
We would like simple if-then, Do/Don’t guidelines.

— Common choices for variables —
Choice: top-my ie top-level lexical . outside any subroutine.
Choice: sub-my ie subroutine-level lexical.
Choice: main-my ie lexicals in main() subroutine, which (if present) is the highest-level subroutine in a typical script.
——

Ground rule in large companies: use strict ‘vars’. No need to explain why.
guideline: short-scope scratchpad vars -> sub-my

Q: Problem with top-my?
A: No big problem. Minor problem — No warning if the variable name is in use.

Design: collect most variables as top-my

If a $startDate needs to be passed among subroutines, i feel top-my is ok.

exit status, if/while, test, &&, ||

I think most if not all shell boolean-constructs evaluate an exit-status. Here is an incomplete list of boolean-constructs — ie constructs about yes/no. Unifying them all are 2 factors — boolean and exit-status, right?

autosys failure/success? yes
the shell’s IF construct? This is the simplest, base case.
WHILE? same as IF
TEST? same as []
[]? yes. IF [ something ] evaluates the exit status of [ something ]
&& , || ? yes

See other post(s) on exit status

perl-ebcdic resources online

http://perldoc.perl.org/perlebcdic.html — Considerations for running Perl on EBCDIC platforms, not about “converting between ebcdic and ascii”

http://search.cpan.org/~cxl/Convert-EBCDIC-0.06/lib/Convert/EBCDIC.pm — Convert::EBCDIC, ascii2ebcdic, ebcdic2ascii – Perl module for string conversion between EBCDIC and ASCII

http://www.foo.be/docs/tpj/issues/vol2_4/tpj0204-0005.html — 1997 article on perl and EBCDIC. Good historical background coverage of ascii^ebcdic

vi corrupting your file

I once had a strange feed file. If I use vi to delete any line, the feed can no longer be processed by the feeder system. I suspect once vi writes the file back to disk, it’s corrupted.

Solution: In my case split -1 and head -1 can do a good enough job of deleting lines. Both keep the feed file in good condition.

Q: is vi designed to edit a binary file?

Q: can perl keep all unaffected lines unaffected?

Acid test: less can show the unprintable characters and shows that after deleting one line, every other line has something missing.

"perl -p" is for one-liners only

You may be tempted to use
#!/usr/bin/perl -p

Top drawbacks, ranked
1) when complexities increase, ultimately the code maintainer will have to consider dropping the -p switch and putting in a visible while() loop. Quite a few changes and lots of testing.
2) what if u need to write to multiple files?
3) inflexible — logic before/after the loop must be put in the BEGIN/END blocks

4) the LINE label is invisible and confusing to some readers and maintainers

I think -p and -n are probably designed for
A) one-liners
B) scripts without growing complexity
C) scripts without other maintainers

perl var dumper "synopsis"

Q: For simple variables, a perl subroutine dump(‘price’) can dump @price content [1] along with the variable name — “price” in this case. But do we ever need to pass in a reference like dump(\@price, ‘price’)? [1] How about a lexical my $price declared in a nested while inside an if, wrapped in a subroutine?

A: I think sooner or later you may have to pass in ref, perhaps in a very rare and tricky context. To show the variable’s name, u need to pass 2 args in total — ref + name

[1] in dump(), print Data::Dumper->Dump (map {[$_]} @_);

studying a complex batch app (WallSt) — suggestions

Challenge: too many steps. Each usually represented by a function if well-modularized.
! Challenge: You don’t know how many of the steps are usually skipped and deserve no scrutiny.
Challenge: too many business rules
Challenge: too much branching including return/break/continue
! Challenge: Each run of the batch takes too long. run-edit-analyze cycle too long.

– Tip: identify and put aside “quick” steps in order to focus on the important steps. Subs that take a short time are usually less complicated or involve less database interaction.
– Tip: real benchmarking (or reverse engineering in general) requires good test data.
– Tip: initially, perhaps you prefer just a single record in input stream.
– Tip: if possible, output all the sql statements. If possible, also “announce” entry and exit of key functions, which provide context to the sql statements.
– Tip: identify non-essential yet slow steps to comment out. Non-essential = zero downstream impact
– Tip: At a key junction in an important function, when you print out a variable it’s immensely useful to see the call stack too.
– Tip: rename variables/functions — one of the safest and reversible changes (with perhaps one major side effect ie cvs diff prove …)

dump perl lexical variables

problem defined:
– Many financial perl systems “use strict”. Programmers typically use lexicals (ie “my $var;”), although “our” is sometimes good enough. By the way, “our” can introduce side effects, compromise isolation and modularization…
– Some lexicals are complex and need a deep dumper
– The extremely /handy/ sss() subroutine can’t see lexicals, although it can see locals ie dynamically scoped
– Perl restriction: P264: Lexicals are invisible by symbolic ref. Try eval
– perl restriction: lexicals are invisible, by any means, outside the lexcical scope. I think we have to keep sss() in the same source file.

Suggestion: use Data::Dumper. I think this can’t see lexicals. Try Dumper(\$your_lex);
Suggestion: our $var99 = $var; sss(‘var’); # inside sss(), we check both var99 and var
Suggestion: P743 of the camel says “unless … the subroutine itself is textually enclosed within the scope ….”. Also P222. Well, if a lexical is declared in a nested block, move sss() definition in?

Suggestion: (cheat) change my to local or our

q[SET]^q[EXPORT]^plain assignment #no-arg SET

(Not sure about csh. Let’s focus on /bin/sh and bash for now. I bet everything here also applies to ksh.)

“set” is mostly used to set options. I see it less often used to manipulate variables, which is possible but unnecessarily complicated. In fact

set a=1; echo $a # shows nothing

  • “set” alone displays variables, but “set” doesn’t modify/create variables. [1]
  • “export” has a single meaning and you see it used consistently (whereas “set” isn’t).

Are you managing variables in bash? A simple “binary” advice is
– use export for ENV variables
– use “nothing” to manage shell variable — Nothing but “a=1”
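
A two-line demo of the difference:

$ a=1; export b=2
$ bash -c 'echo "a=[$a] b=[$b]"'    # a child process sees only the exported one
a=[] b=[2]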

How about “let”? Only needed for arithmetic. I would say don’t let this stranger join the club and confuse yourself.

[1] In DOS, “set” command creates variables (http://www.easydos.com/set.html)

perl one-liners for xml

based on http://www.xml.com/pub/a/2002/04/17/perl-xml.html.

Find all section titles in a DocBook XML:

# cat files/mybook.xml | xpath //section/title

Retrieve just the significant text (not including nodes containing all-whitespace) from a given document:

# cat files/mybook.xml | xpath "//text()[string-length(normalize-space(.)) > 0 ]"

Save the entire data stored in the ‘users’ table as a huge file users.xml:

# sql2xml.pl -sn myserver -driver Oracle -uid user -pwd seekrit -table user -output users.xml

Pretty-print a bad xml file:

# cat overwrought.xml | xmlpretty > new.xml

Use the built-in HTML parser to convert ill-formed HTML to XML before further processing:

# xmllint --html khampton_perl_xml_17.html | xpath "//a[@href]"

AND MORE….

Perl needs OO

Q: Criteria for favoring OO over procedural Perl?
A: P 320 [[ Perl best practices ]] gave a few criteria I could identify with.

* encapsulation — between class and clients. “Implementation of individual components of the system is likely to change over time”. Hide volatile internal implementation from clients

* “Large number of other programmers will be using your code modules”

Perl batch applications]a financial IT team

Team@@ A team of contractors to UBS (Wealth Management)
Who@@ 4 dedicated Perl developers, managing 10-20 mission-critical perl applications.

Symptom@@ Always firefighting — always busy with some urgent issues.

Cause@@ Probably not all due to technical problems. Sometimes an urgent business requirement pops up and requires a quick and dirty solution. Batch solutions are often the most quick-and-dirty solution.

#1 complaint@@ maintainability — not extendable, unchangeable, inflexible

Biggest complaint against Perl@@ Reading other people’s perl code can be painful. UBS does have coding standards, but somehow they are not enforced for these 4 Perl guys, perhaps because of constant firefighting.

#2 most common issue in the environment is related to “shared codebase”. When you modify a shared code, more than 1 system can be affected. Since the existing code is almost unreadable (according to some of the 4 guys), the impact of changes is unpredictable.

Both problems contribute to the maintenance nightmare.

My diagnosis@@ Priority set by leadership. Quick-and-dirty is the chosen priority. Perl does permit unreadable coding styles. See my other post on Perl::Critic. If leadership is ambitious and wants to support twice the amount of business requirement without increasing headcount, then maintainability could suffer.

Justification for migrating to Java batch@@ One possibility is interoperability and code sharing with non-batch java apps.

perl shift/unshift/push/pop quick intro

pop/push are more intuitive verbs. They treat an array or a list like “(0,5,2)” as a stack (eg: call stack) and operate on the RIGHT end of the stack, where the index is highest.

* In one phrase, pop/push operates on the newest element; shift/unshift on the oldest.
* In one phrase, pop/push operates on the rightmost; shift/unshift on the leftmost element.
* In one phrase, pop/push operates on the highest index, while shift/unshift on the lowest, which is zero. The highest index is $#my_array, which equals scalar(@my_array) - 1.
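
In code:

my @stack = (0, 5, 2);
push @stack, 9;             # (0, 5, 2, 9) -- appends at the right/newest end
my $newest = pop @stack;    # 9; back to (0, 5, 2)
my $oldest = shift @stack;  # 0 -- removes the leftmost/oldest, index 0
unshift @stack, 7;          # (7, 5, 2)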

bash search`unix command #precedence

Q: what if you have an alias, a shell func, a shell built-in (like “echo”), a unix executable file, and your own script all sharing the same name?

Focus on 2 simple rules for now:

1) Your own aliases override everything else (such as shell functions)

2) $PATH is the very last (after things like shell built-in commands) place searched by the shell. Executables in $PATH are external to the shell.

I think the same precedence holds for any shell.
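
In bash, q(type -a) lists every match in precedence order (exact output varies by system):

$ type -a kill
kill is a shell builtin
kill is /bin/kill
$ alias kill='echo NO'; type -a kill   # an alias now shadows both
kill is aliased to `echo NO'
kill is a shell builtin
kill is /bin/kill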

%%perl iview q

You have done this thing for years, but iview questions may surprise you and make you look like a fake. Even a veteran needs to anticipate iview questions.

  1. how many SAX xml parsers?
  2. how many DOM xml parsers?
  3. how many xslt transformers?
  4. describe some perl/xml projects
  5. query cache?
  6. how do u rollback a db tx?
  7. any way to re-use a db connection after end of script? no background process to keep the memory allocation
  8. real xp with xml parsing? zed

chmod -x /bin/chmod

Q: How to undo “chmod -x /bin/chmod” @@

A: perl has a “chmod” function. Probably the root user can do a one-liner — sketched at the end of this list.
A: php has a “chmod” statement. Root user may need the php command line. php also has chown()
A: python also has a chmod() method in the OS module

A: copy from another host. Check file size first.
A: recover from backup
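
For example, the perl answer as a root one-liner:

# perl -e 'chmod 0755, "/bin/chmod" or die "chmod failed: $!"'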

for loop dos^shell

–dos: FOR %A IN (1 2 3 4 5 6 7 8 9 10) DO echo …

–shell:
i=4; while [ $i != 0 ]; do
i=`expr $i - 1`
done

The let command is a replacement for the old method of performing shell arithmetic using the expr command.

# Note: != syntax is easier to remember and less error-prone than -le. Now, If you must count from 1 upward,

i=1; while [ $i -le 4 ]; do
i=`expr $i + 1`
done

batch feature wishlist (WallSt) — Undo

Let’s focus on undo. Undo is esp. important for Wall St EOD batch applications because a batch owner is often unable to anticipate the full range of possible input combinations and other environmental conditions that could lead to a large number of mistakes going unnoticed. In fact, when a batch owner puts in a code fix, it’s usually after a disaster, after the batch has processed (many) unexpected records unknowingly.

* undo a specific step for all records
* undo a specific record
* undo a subset of records, based on a criteria using log analysis
* undo an entire batch

Undo implementation? Perhaps infrastructure support is needed. Look at InnoDB in Mysql — low-level support needed.

perl q[use strict vars] — tips

A one-sentence summary — somehow “mentioned” as a variable, before you can use it under strict

Longer summary — For a variable to be acceptable, either it is package-qualified or pre-declared like —
– my, our but not “local”
– mentioned in “use”, i.e. imported from a module [1]

This is a pretty good start if you can’t remember every detail about use strict ‘vars’.

[1] ok to export/import subs but beware of exporting/importing vars. P409 [[ perl best practices ]]

perl accept()

I think a common Perl socket idiom (used repeatedly in my scripts) is


while ( accept(Client,Server) ) {
# use Server and Client file handles
close Client;
}
close Server;

I think the while loop would run only once (P 442 [[ programming perl ]]). The first time it runs, accept() blocks until a client connects. After the connect, the loop runs for n seconds, the duration of the connection. At end of the connection, this script closes Client, exits the loop (?), and closes Server stream.

How does the server continue to service other clients? I think fork and exec is the standard solution, spawning a new process. In contrast, Java spawns a new thread — perhaps more efficient.

Remember a single-thread NIO server can handle thousands of sockets, one for each concurrent client. Even a traditional IO server can spawn hundreds of threads.

Hundreds of “processes” sounds too heavy for an ordinary OS. Apache httpd on our servers typically spawns dozens of child processes.

perl constructor

constructors can be named “new” or something else like “spawn” or “new2”, but probably never identical to the classname. From now, let’s assume it’s “new”.

A constructor is technically just another class[1] method. Technically the only(??) difference I can see is the last “return bless..” statement.

[1] A perl constructor is usually a class method, seldom an instance method.

2min guide2rbac]solaris10

–5 things about rbac-priv
ppriv $$ # usually shows “basic” repeatedly, and after adjustment, file_dac_read

a “primitive datatype”
can be granted to a process (lab), a user/role, a zone …

full list of privileges? no

–5 things about rbac-role
u create roles with gui
closest thing is an old-fashioned user — with a home dir; u can su to it with a password;
no built-in roles
“roles” command shows what roles u can su to

full list of roles? /etc/user_attr

–5 things about rbac-rights_profile
more complex than other rbac concepts
a right is a collection of (commands,authorizations,other rights…), and not a “primitive data type”
lots of built-in right profiles like ‘System Administrator’, ‘Primary Administrator’
sometimes referred as “profiles” or “rights”

full list of right_profiles? /etc/security/exec_attr

enforcing Perl coding standards

Large Perl teams often wish to enforce Perl “coding standards” to control bewildering style variations permitted in Perl.

Perl::Critic applies 256 coding style “policies” and outputs warnings. Management could adopt various levels of enforcements
* automate the check in an automated build/test/release/deploy process
* track violation statistics for each developer and each team
* periodic scan of codebase
* require every developer to check, and send output to a coworker for peer review

These practices are similar to taint checking, “use strict” and -w.
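
For reference, a minimal sketch of the programmatic API; the perlcritic command-line tool is the more common entry point, and the file name below is a placeholder:

use strict;
use warnings;
use Perl::Critic;

my $critic = Perl::Critic->new( -severity => 3 );       # 5 = gentle ... 1 = brutal
my @violations = $critic->critique('some_script.pl');   # placeholder file name
print @violations;    # each violation stringifies to a readable warning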

Teams often wish to customize or disable some “policies”. P122 [[ Mastering Perl ]]

The #1 question is “How precisely do Perl::Critic && PPI detect violations”, without false positives or false negatives. How intelligent and reliable is the tool?

q[use strict] in perl

“use strict” is the most important pragma for defensive, maintainable coding in a large team.
I think there are 3 main things to be strict about.

1) use strict ‘refs’ # forbids symbolic references
2) use strict ‘subs’ # treats any bareword (i.e. ambiguous unquoted identifier) as a compile-time error. P860 has a good first paragraph.

I think the most important “use strict” is on global variables:
3) use strict ‘vars’ # forbids undeclared variables (confirmed)

I think it only accepts my and our vars (not “local”, which doesn’t declare anything), built-in vars and *package-qualified* (ie global) vars. Now I know it also allows imported vars via

  use MyModule3 qw(sub8 $var2_with_dollar);

Footnote: Why is $var2_with_dollar rarely seen in “use” statements@@ P409 [[ perl best practices ]] warns against exporting/importing vars.
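
A hypothetical MyModule3.pm showing how $var2_with_dollar could be exported, despite that warning:

package MyModule3;
use strict;
use warnings;
use Exporter 'import';
our @EXPORT_OK = qw(sub8 $var2_with_dollar);   # client must request these by name
our $var2_with_dollar = 42;
sub sub8 { return $var2_with_dollar }
1;

# client script:
# use MyModule3 qw(sub8 $var2_with_dollar);
# print $var2_with_dollar;    # ok under strict 'vars' -- it was imported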

a perl package is a namespace

I don’t agree that a package, a module and a class mean roughly the same thing, even though the trio often map 1-to-1. In perl, the package construct is all about namespaces, while modules are more complex.

Some say a package represents a namespace; others say a package is primarily a symbol table. In simplistic terms, a package is a common prefix (or tag) you attach to a bunch of identifiers like variables, subroutines… A module is vastly different. A module is often a library of subroutines. A module can also become a class if you add some features.

@myPackage::myArray_withoutAt is a typical usage of a package.

A namespace consists of variables, subroutines …

In Python, I’d say such a “namespace” is an idic, i.e. an internal dict.
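
A tiny sketch of the common-prefix idea, reusing the identifier above:

package myPackage;
our @myArray_withoutAt = (1, 2, 3);         # lives in myPackage's symbol table

package main;
print "@myPackage::myArray_withoutAt\n";    # fully qualified access from another namespace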

prepare/execute/fetch (Perl DBI)

1) the first “catch-phrase” to memorize is the most common

– – – > prepare/execute/fetch

All 3 methods are specified in DBI but implemented (vendor-specifically) in DBD modules, ie DBD drivers.

2) For non-select, prepare/execute will do.
3) The simplest: do() alone can replace prepare/execute.
4) prepare/bind/bind/bind/bind/bind/bind/bind/../execute/fetch
5) stored procedure with 3 result sets ie 3 selects[1]: prepare/execute/fetch/more_results/fetch/more_results/fetch

Beware that most of these methods belong to $statement_handle like

– – – – > $statement_handle -> method1

whereas a few, such as do(), belong to $db_handle.

[1] excluding select-into. I think a select-into always returns one(???) row, saves it into variables, and does not produce a result set.
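
A minimal sketch of idioms 1) to 4); the DSN, credentials, table and columns are all placeholders:

use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:mysql:testdb', 'user', 'pass', { RaiseError => 1 });

my $sth = $dbh->prepare('SELECT id, name FROM employees WHERE age > ?');
$sth->execute(30);                    # binds the placeholder, then executes
while ( my ($id, $name) = $sth->fetchrow_array ) {
    print "$id: $name\n";             # fetch, row by row
}

$dbh->do('UPDATE employees SET active = 0 WHERE id = ?', undef, 42);   # non-select: do() alone
$dbh->disconnect;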

binary predictions@next Perl environment

true binary predictions = predictions that can be clearly proved right or wrong, never ambiguous or “outcome unknown”. Here are a few almost-binary predictions about the Perl systems at the next company. Let’s verify them 3 months into the job.

BP: Adaptability — to frequent change is one of the biggest pains, if not the #1. Current Adaptability level isn’t state of the art.

BP: Speed — to market is likely to be the #1 metric. This depends on Adaptability.

BP: modularity, decoupling — between modules is an absolute must for Adaptability and Speed. As a large dev team, they need modularity.

P: perl module dev — Perl modules are the standard solution to modular development in large teams. I’m expected to understand and use perl modules.

P: OO — Expertise with OO perl modules, designs, best-practices … is perhaps not insignificant to my role, my job performance, and entire-team performance.
– > I believe OO can help Adaptability-to-change and Speed-to-market.

BP: creating Perl classes — within first month, I am not expected to do so for production systems.

P: parsing — biz rules are likely to change often. As far as parsing is concerned, Flexibility could be the #1 Speed factor, leading to fast and reliable implementations.

BP: data volume — is a non-trivial challenge for the perl scripts. Perl Modules, best-practices, designs to solve this problem will receive their attention.

P: Untuned SQL — but functional. Once found, they need remedy, and will not be ignored or accepted. Not a Binary Prediction, because some unimportant queries will definitely be accepted even if untuned.

P: test coverage — many existing non-trivial perl scripts will lack reliable test coverage, and I’m not sure reliable coverage is even feasible for them.

P: readability — is a non-trivial headache, even if long-tolerated, but not sure how much improvement is feasible.

BP: DB access — in this environment is a standardized and optimized routine that i don’t need to try and improve.

BP: basic XML API — in this place is a standardized and optimized routine that I need not try and improve. You may be able to put XML to new usage, though. You may innovate on a higher level.

challenge#1 in Perl-module arch

Pretend to be Larry Wall before he designed the module architecture. Larry was pondering how to seamlessly integrate a module and its “client”.

Looks like Larry’s very first challenge was namespace sharing between the module and its client. A module, being a library, defines utility subroutines, to be called within the client’s namespace.

This first challenge introduces us to Exporter.pm, “polluting”, packages, symbol tables … as sketched below.
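
A sketch of the underlying mechanics; the names are hypothetical, and the glob assignment is roughly what Exporter does for each exported symbol:

package MyLib;
sub util_sub { print "callable from the client's namespace\n" }

package main;
*main::util_sub = \&MyLib::util_sub;   # alias the sub into the client's symbol table
util_sub();                            # no package prefix needed -- the "pollution"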

Perl OO quiz

Q: name the most obvious feature that sets a method apart from ordinary subroutines. P 318

Q: You know that a method undefined in perl class C will be searched for in base class B. Describe in 1-2 sentences how this “method inheritance” works in perl. There’s a key data structure at the center, known as …? P321

Q: Given a Person class (P332) with an age attribute, write a simple but complete getter and a setter. Also show simple but complete code to call the getter and setter. (One possible answer is sketched after the quiz.)

Q: private/public members?

Q: type casting?
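
One possible answer to the getter/setter question, in the classic hash-based style (the Person internals below are assumed, not taken from P332):

package Person;
sub new { my ($class, %a) = @_; return bless { age => $a{age} }, $class }

sub age {                         # combined getter and setter
    my $self = shift;
    $self->{age} = shift if @_;   # setter branch: an argument was passed
    return $self->{age};
}

package main;
my $p = Person->new(age => 29);
$p->age(30);                      # call the setter
print $p->age, "\n";              # call the getter: prints 30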

Read Rules, following perlreftut

Here are a couple of simple rules for reading Perl /expressions/ involving references. Let’s start with a simple expression, like $a->[2]

Read Rule 1: “EVERYTHING on arrow’s left is a reference.”

At runtime, Perl actually follows the operator precedence rules and evaluates that reference before it looks at the right side of the arrow. In other words, Perl completely ignores the right side when evaluating the left side.

If you follow Read Rule 1 and dereference/unwrap it, you usually get an array (for ->[…]) or a hash (for ->{…}). When you are confronted with a big expression, you should follow Read Rule 1 and spell things out. See examples later.

Armed with Read Rule 1, we are ready to spell out slightly complex expressions, where we need —

Read Rule 2: “Add the omitted arrows”, ie at junctions like ]{ or ][ or }[ … This is the reverse of one of the original Use Rules in perlreftut. Read Rule 2 is simple, but helpful when you need a clear understanding.

$a->[87]{str1}     # is equivalent to ...
$a->[87]->{str1}   # ... this, by Read Rule 2

$everything_on_the_left_of_last_arrow = $a->[87];        # Read Rule 1
%hash3 = %{ $everything_on_the_left_of_last_arrow };     # Use Rule 1

$everything_on_the_left_of_1st_arrow = $a;               # Read Rule 1
@array3 = @{ $a };                                       # Use Rule 1
# now $array3[87] == $a->[87] == $everything_on_the_left_of_last_arrow == a reference to a hash
# and the 87th element of @array3 points to the same anonymous hash that %hash3 was copied from
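
A self-contained version of the walk-through, with data invented for illustration:

use strict;
use warnings;

my $a = [];                                 # $a holds an array reference
$a->[87] = { str1 => 'hello' };             # element 87 holds a hash reference

print $a->[87]{str1}, "\n";                 # arrow omitted at ]{ ...
print $a->[87]->{str1}, "\n";               # ... restored by Read Rule 2

my $left_of_last_arrow = $a->[87];          # Read Rule 1: everything left of the arrow is a reference
print $left_of_last_arrow->{str1}, "\n";    # prints "hello" a third time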