Thursday, May 17, 2012
Notes SVN
NotesSVN
Summary
Repository creation
Start the server
Start the server as a deamon
Installing SubEclipse plugin: Subversion plugin for Eclispe
Check-in project into SVN for first time, from Eclipse
Delete Project and Checkout Project from Eclipse
Other Help
Command Line
Windows - CollabnetSubversion - Upload Google Code
Visual Studio and AnkhSVN
SVN Concept
SVN status Flags
SVN server configuration workflow
Summary
========
Reference:http://svnbook.red-bean.com/en/1.1/svn-book.pdf
Subversion is installed on Fedora distributions:
svn The command-line client program.
svnversion reporting the state (in terms of revisions of the items present) of a
working copy.
svnlook inspecting a Subversion repository.
svnadmin creating, tweaking or repairing a Subversion repository.
svndumpfilter filtering Subversion repository dump streams.
mod_dav_svn plug-in module for the Apache HTTP Server, used to make your repository
available to others over a network.
svnserve custom standalone server program, runnable as a daemon process or invokable by SSH;
another way to make your repository available to others over a network.
Repository creation
===================
User "svn" created (passwd svn), mmsn's group member (chee, quilici als belong to this group). This account
will host the repository.
Log as svn user.
svnadmin create --fs-type fsfs /home/svn/mmsn
This creates the repository mmsn, that will contain all projects. Note that --fs-type fsfs force s to use fsfs instead of Berkley DB
Importing initial data
======================
0. Login as svn user
1. Create SVN style directory structure, on local machine, for the code to be imported.
$ mkdir tmpdir
$ cd tmpdir
$ mkdir BIS
$ mkdir BIS/trunk
$ mkdir BIS/branches
$ mkdir BIS/tags
$ cd BIS/trunk
$ mkdir bin build config lib src ....
$ cd ../.. i.e. go back up to ..../tmpdir from ..../tmpdir/BIS/trunk
2. Import files manually or by eclipse
svn import . file:///home/svn/mmsn --message 'Initial import'
Adding BIS
Adding BIS/trunk
Adding BIS/trunk/build
Adding BIS/trunk/build/build
Adding BIS/trunk/build/build.bat
Adding (bin) BIS/trunk/build/xml-apis.jar
Adding (bin) BIS/trunk/build/xercesImpl.jar
Adding BIS/trunk/build/build.properties
Adding (bin) BIS/trunk/build/ant-launcher.jar
Adding (bin) BIS/trunk/build/ant.jar
Adding BIS/trunk/build/build.xml
Adding BIS/trunk/config
Adding BIS/trunk/config/log4j.properties
Adding BIS/trunk/config/bisJalopy.xml
Adding BIS/trunk/lib
Adding BIS/trunk/lib/licenses
Adding BIS/trunk/lib/licenses/log4j.LICENSE
Adding (bin) BIS/trunk/lib/log4j-1.2.8.jar
Adding BIS/trunk/src
Adding BIS/trunk/bin
Adding BIS/branches
Adding BIS/tags
…
Committed revision 1.
$ cd ..
$ rm -rf tmpdir
$ cd mmsn
$ chmod -R 775 . This gives mmsn's group members r+w+x permissions in the repository
Following is not needed, just kept to remember
=====================================================================
#Start the server
#================
#svnserve -t -r /home/svn/. -t option means that we are going to use ssh to connect svn server, -r should allow to use relative paths
#when invoking the server (-r does not seem to work with -t option !!!!)
#
#Start the server as a deamon
#============================
#create a file under /etc/init.d: svnd
#
# more /etc/init.d/svnd
#!/bin/bash
#
# gdm-bootup-complete This shell script removes the boot throbber from
# the login screen
#
# chkconfig: 5 99 1
# description: This shell script removes the boot throbber from the login screen
# source function library
#. /etc/rc.d/init.d/functions
#
#start() {
# daemon --user=svn svnserve -t 1> /home/svn/svnd.log 2>&1
#}
#
#stop() {
# killall -KILL svnserve
#}
#
#case "$1" in
# start)
# start
# ;;
# stop)
# stop
# ;;
# *)
# echo $"Usage: $0 {start|stop}"
# exit 1
# esac
#
# exit 0
#
#Then add the svnd service with chkconfig:
#/sbin/chkconfig --add svnd
#/sbin/chkconfig --level 5 svnd on
#
#
#Change Security Context of some libraries (only on Linux running Fedora or Red Hat with SecureLinux turned on)
#chcon -t texrel_shlib_t /usr/local/lib/libsvn_fs_fs-1.so.0
#chcon -t texrel_shlib_t /usr/local/lib/libsvn_ra_svn-1.so.0
#chcon -t texrel_shlib_t /usr/local/lib/libsvn_repos-1.so.0
#chcon -t texrel_shlib_t /usr/local/lib/libsvn_subr-1.so.0
=====================================================================
Installing SubEclipse plugin: Subversion plugin for Eclispe
============================================================
0. If using a proxy, fill in proxy details at:
Windows -> Preference -> Install/Update
1. Install plugin from Eclipse:
a) Help -> Software Updates -> Find and Install -> Search for New Features to Install ->
New Remote Site -> Fill in URL="http://subclipse.tigris.org/update_1.0.x"
b) Click finish and follow update instructions.
c) for deetailed instructions: see http://subclipse.tigris.org/install.html
2. Change configuration in Eclipse:
Windows --> Preferences --> Team --> SVN panel, change the SVN interface to JavaSVN(Pure Java)
3. Connect using ssh
a) Window -> Show View -> Other -> SVN -> SVN Repository
b) Right click any space within SVN Repository. New -> Repository Location:
c) Enter: svn+ssh://enrol1/home/svn/mmsn (repository that is created on the SVN server already)
d) Choose authenticate by password or Key. If choose Key option, enter the location of the private-key file of the local machine. Ensure public key of local machine has been added to authorized_keys file of the remote machine.
4. SVN through http proxy
a) Goto where subversion is installed, eg:
c:\Documents and Settings\<user>\ApplicationData\Subversion\
b) Edit the file called "servers" and modify the variables fors:
http-proxy-host
http-proxy-port
Check-in project into SVN for first time, from Eclipse
=======================================================
1. In Eclipse, right click on the project to be deposited into SVN for first time.
2. From the Context menu, select Team -> Share Project
Delete Project and Checkout Project from Eclipse
=================================================
1. Go to Eclipse and click on selected Project (eg. Notes[Notes\trunk]) and close the project
2. Delete contents of the project (Make sure you have a backup of all the files somewhere else)
3. From Eclipse, New->Other->Checkout project from SVN, using location:
svn+ssh://<host>/home/svn/mmsn
... where host is the hostname (e.g. arnold)
4. Select Notes/trunk
5. Checkout using Project Wizard.
Other Help
==========
Check the manual in the Elipse's help section to get started.
Check the help manual for the commands.
Command lines
=============
To checkout the code
svn co svn+ssh://arnold/home/svn/mmsn/BIS/trunk/ BIS
To update
svn update --> check options in the manual
To rename, even when filenames have spaces
svn move svn://localhost/home/svn/QuantAnalytics/second\ text\ file svn://localhost/home/svn/QuantAnalytics/Seee
svn update
Windows - CollabnetSubversion - Upload Google Code
====================================================
Create local repository
- svnadmin create c:\svn_repo
Upload first item into local repository
- svn import u:\Notes\file.txt file:////c:/svn_repo/Notes -m "initial creation"
For sync'ing with Google the First Time, click on "reset this repository" in google code -> Source -> Checkout.
Syncing local repository to Google Code repository
- svnsync init --username <Google User name> https://<webproject>.googlecode.com/svn file:////c:/svn_repo
>>> Copied properties for revision 0.
- svnsync sync --username <Google User name> https://<webproject>.googlecode.com/svn
>>> Transmitting file data .
>>> Committed revision 1.
>>> Copied properties for revision 1.
Visual Studio and AnkhSVN
===========================
Assuming the repository has been created.
1. Create a new folder in the repository.
From VS.Net open the Repository Explorer by right click.
Create a folder called myWeb say, in the repository.
2. To map h:\myWeb, where myWeb already has contents:
Close any solutions.
Create new Project -> Other Project Types -> Visual Studio Solutions -> Blank Solutions.
Name = myWeb
Location = H:\
Add to Subversion
3. Fill in the Subversion path.
4. Start adding files.
Click on the generic "Solution" in Solution Explorer and "Add Existing Items"
SVN Concept
=============
Repository Vs Working Copy
- files that are edited are Working Copy Files.
- Working Copy is created by "checkout" the files from the Repository
File Status: Unchanged Vs Locally Changed
- "svn commit" will do nothing if the file is unchanged.
- If Locally Changed, "svn commit" will commit changes to repository unless the file is Out of Date.
File Status: Current Vs Out Of Date
- If Repository file has changed since last checkout or update, then it is Out of Date.
- "svn update" will merge local file into repository file.
svn checkout <SVN_path>/<Project Dir>
- Warning, the project path including the project name will be copied into the current directory. So ensure the current directory is the Parent of the intended project directory.
svn commit
- Push files to repository
svn update
- Pull filed from repository
SVN status Flags
================
A The file, directory, or symbolic link item has been scheduled for addition into the repository.
C The file item is in a state of conflict. That is, changes received from the server during an update overlap with local changes that you have in your working copy (and weren't resolved during the update). You must resolve this conflict before committing your changes to the repository.
D The file, directory, or symbolic link item has been scheduled for deletion from the repository.
M The contents of the file item have been modified. If you pass a specific path to svn status, you get information about that item alone:
SVN server configuration workflow
===================================
RHEL5 has got subversin 1.6.11 pre-installed for both i386 and x86_64
To use Subversion with R, the work needs to be organized a Rstudio Projects.
1. To check if the svnserver starts automatically, whether svn is preinstalled for Red Hat, or self install later, type:
ls -laF /etc/init.d/svnserve
to see if the file exists. This file runs a script to start the daemon.
2. Create svn user and group, type:
sudo useradd svn - use id svn to check user details
sudo passwd svn - make password for svn -> "svn"
sudo usermod -a -G R_POC_TEAM svn - this puts the user "svn" to the group called R_POC_TEAM.
Ensure that all other user of svn is in the same group called "R_POC_TEAM". To check which groups the users are in, type:
sudo more /etc/passwd
3. Create the SVN repository for all members of group.
- this is done only ONCE, when a new VM and subversion is installed.
- all users and the 'svn' user need to be in the same group. In this example, all these users belong to a group called R_POC_TEAM. If this is not true, make it so.
- Login as svn user.
- create the repository called QA, type:
svnadmin create --fs-type fsfs /home/svn/QA
This creates the repository QA, that will contain all projects. Note that --fs-type fsfs is the preferred filesystem for svn rather than the Berkley DB filesystem (see manual).
4. Check the file /etc/services. If it does not contain the following lines, then add them:
svn 3690/tcp # Subversion
svn 3690/udp # Subversion
5. Configuring Authentication and Authorization
- LOGIN AS USER called 'svn'
- go to the QA repository, eg:
cd /home/svn/QA
- Create a file under ...QA/conf/svnserve.conf, with the following content:
[general]
password=db = passwd
realm = QA realm
anon-access = read
auth-access = write
...... passwd is actually the word "passwd".
- Ensure the file called 'passwd' exist under ...QA/conf/ and that it has the following content:
[users]
harry = harryssecret
sally = sallyssecret
.... where harry and sally are valid users in the linux box.
6. Hide the absolute path of the svn server and provide relative paths to user only.
Edit this file: /etc/rc.d/init.d/svnserve
by adding "-r /home/svn" to the following line if it exist, so that it becomes:
args="--daemon --pid-file=${pidfile} $OPTIONS -r /home/svn"
Now when the svn server is mentioned, it is done by:
<svn server>/QA instead of <svn server>/home/svn/QA
The following information are for references:
- Access SVN. The URL pointing to the repository is:
svn://<HOST>/QA
svn://localhost/QA
- Some svn admin commands: svnadmin, svnlook, svndump, svndumpfilter, svnsync
- svnlook info <repos> - prints information for the repository at <repos>.
The information include author, date, number of lines of log, log message
- sudo killall svnserve - force termination of any active svnserve services. Also solves the error with message:
" svnserve: Can't bind server socket: Address already in use"
- sudo /etc/init.d/svnserve start - to start svn server once off.
***** How to setup iptables - NOT Needed here
http://articles.slicehost.com/2007/9/5/introduction-to-svnserve
Labels:
AnkhSVN,
CollabnetSubversion,
start daemon,
start server,
Subeclipse plugin,
Subversion,
SVN,
svnserve
Notes Linux
Contents
- 32bit vs 64bit
- Bitmap Fonts
- CDwriting
- WinXP and Lilo booting sequence
- SSH Tunnelling Enabling and VNC
- SSH X11 port forwarding
- USB Mounting and /etc/fstab editing
- Loading Profiles
- Linux Firewall
- Linux Runlevels
- Block web-ads
- SELinux Configuration
- For using SELinux with cgi-bin and files in Symbolic links
- RPM
- Installing Source RPMs
- Linux Commands
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
32bit vs 64bit
==============
CPU 32 bit -> integer pointers are 32bits
2 to power 32 = 4e+09 about 4 Giga stuff
Real*8 is 8 bytes - i.e. 64 bits but these are a separate thing
altogether from integer pointers.
Pointers (a C/C++ concept - alien to Fortran)
Consider an array of REAL*8 Arr(17:27)
there are 11 elements, i.e. from Arr(17), ...., Arr(27)
each element can be referenced with a pointer so we need 11 pointers.
but if the pointer size is 2^32 (4e+09),
so the max array size(w.r.t. #elements) possible is 4e09
Bitmap Fonts
============
Eg. Installing Korean Fonts
0. Log in as root
1. Download Korean Fonts and put into /scratch/chee/fonts/Korean-X
The files are:
hanglg16.pcf.gz hanglm16.pcf.gz hanglm24.pcf.gz
If the files are in bdf format then,
i) convert to pcf:
bdftopcf <file>.bdf
ii) compress it
gzip <file>.pcf
or
compress <file>.pcf
2. Create an index file for the fonts in that directory
mkfontdir /scratch/chee/fonts/Korean-X
this will create fonts.dir in the Korean-X directory
3. Tell the Xserver where to find the new fonts.
xset fp+ /scratch/chee/fonts/Korean-X
or
xset +fp /scratch/chee/fonts/Korean-X
to check the current X settings
xset q
to add this permanently, add the xset fp to .xinitrc
4. Refresh the Xserver setting when new fonts added if necessary
xset fp rehash
5. To view fonts in IRIX
xfd
CDwriting
=========
using cdrtools (cdrecord, mkisofs, etc..)
Ref: http://wt.xpilot.org/publications/linux/howtos/cd-writing/html/
1. Collect files and make a iso filesystem
mkisofs -r -o cd_image private_collection/
where
-r = Rockridge extns (deeper than 8 levels, permissions, etc)
-J = Joliet extensions - more MS Windows friendly
-o = output filename
-v = verbose
-x = exclude directory path
for second session only:
-C x,y = x is last session start, y is next session start
= use "cdrecord msinfo" to get -C x,y info
= must use with -M option
-M path = path where multi session is to merge. Usually path is
= place normally mount CDROM eg /mnt/cdrom
EG:
cd /scratch/chee/software
mkisofs -v -r -J -o firstImage.raw forWinPC/AdbeRdr601_enu_full.exe
*****
mkisofs 1.12.1 (i686-pc-linux-gnu)
61.14% done, estimate finish Fri Sep 10 16:32:27 2004
Total extents actually written = 8189
Total translation table size: 0
Total rockridge attributes bytes: 264
Total directory bytes: 0
Path table size(bytes): 10
Max brk space used 5544
8189 extents written (15 Mb)
*****
2. Test Image
Mount the iso image (cd_image) into the directory /cdrom, so you can check the directories.
mount -t iso9660 -o ro,loop=/dev/loop0 cd_image /cdrom
3. Writing to CD
to test the CD writer and get info
cdrecord -scanbus
to write to the CD
shell> SCSI_BUS=0 # taken from listing 1 "scsibus0:"
shell> SCSI_ID=6 # taken from listing 1 "TOSHIBA XM-3401"
shell> SCSI_LUN=0
shell> cdrecord -v speed=2 dev=$SCSI_BUS,$SCSI_ID,$SCSI_LUN \
-data cd_image
# same as above, but shorter:
shell> cdrecord -v speed=2 dev=0,6,0 -multi -data cd_image
where
-dummy = simulation (laser off)
-eject = eject after burning
-v = verbose
-multi = multi-session
-msinfo = get multi-session info
-toc = get table of contents
-data = data file (mode1 YellowBook)
-audio = audio files
speed=x = x is speed for writing
dev = 0,0,0 for parallel.hpc.unsw.edu.au
EG
cdrecord -dummy -v speed=2 dev=0,0,0 -multi -data firstImage.raw
4. For CDRW
writing the 2nd, 3rd sessions onwards
shell> NEXT_TRACK=`cdrecord -msinfo dev=0,6,0`
shell> echo $NEXT_TRACK
shell> mkisofs -R -o cd_image2 -C $NEXT_TRACK -M /dev/scd5 private_collection/ \
EG.
cdrecord -msinfo dev=0,0,0
0,19591
mkisofs -v -r -J -o secondImage -C 0,19591 -M /dev/cdrom forWinPC/Firewall
cdrecord -dummy -v speed=2 dev=0,0,0 -multi -data secondImage
WinXP and Lilo booting sequence
===============================
- Example for Slackware Linux
- use the linux diskette to startup linux
- login as root
- modify the /etc/lilo.conf file as follow:
lba32
boot = /dev/hda1
vga = 773
image = /boot/vmlinuz
root = /dev/hda1
label = Slackware <-- no space
read-only
- then do:
$ lilo -b /dev/hda1 <-- lilo on hda1
$ dd if=/dev/hda1 of=/dos/boot.lnx bs=512 count=1
$ shutdown -r now
- take out the linux boot diskette and re-boot the
system to WinXP, then do:
** copy the file from d:\dos\boot.lnx to c:\
** then, modify the file c:\boot.ini to include
following lines:
[boot loader]
timeout=30
[operating systems]
c:\boot.lnx = "Slareware Linux 9.1"
...
- re-start the Wxp and there should be an option on the Wxp
loader to boot to Linux partition.
- If you only want to boot to the Linux partition only, just set
the active partition to the Linux installation partition (in this
case, it is hda1) and reboot the Windows again.
SSH Tunnelling and VNC
======================
ssh -L x:localhost:y anotherhost
means connect to anotherhost using ssh and listen to port x on the local machine and forward any connections there to port y on anotherhost.
To start a VNC server type:
vncserver
To start a VNC viewer type:
vncviewer <name.of.remote.machine>:<display.number>
Using secure VNC with ssh, if a VNC server is running as display:1 on a system called linuxserver, you can start a secure SSH session by typing:
ssh -L 5902:localhost:5901 linuxserver
So now, connecting to port 5902 on a local machine actually connects to port 5901 on linuxserver. Thus to connect from a client machine:
vncviewer localhost:2
SSH X11 port forwarding
========================
On SSH server (remote host):
To check SSH is setup for X11 port forwarding, look at the file:
/etc/ssh/sshd_config
and see if the statement is present:
X11 Forwarding yes
then restart X11
/etc/init.d/sshd restart
On Cygwin (client):
- copy /etc/defaults/etc/ssh_config into ~/.ssh
- ensure the following are set:
ForwardX11 yes
ForwardX11Trusted yes
- DO NOT export DISPLAY=localhost:0.0
To ssh with X11 forwarding (may be different for different Unix/Linux)
ssh -X <name>@<host>
USB Mounting
=============
1. must be root to do this.
mount -t vfat /dev/sda1 /mnt/usb/
2. Alternatively, modify /etc/fstab file to have this line
/dev/sda1 /usbdisk vfat user,exec 1 1
Then any user can type
mount /usbdisk
and the files on the usb will have User's permissions
3. To unmount, type:
umount /mnt/usb/
or
umounut /usbdisk
Loading Profiles
================
/etc/profile -> /home/<user>/.profile -> /home/<user>/.bashrc
Master startup scripts are in /etc :
bashrc profile csh.cshrc csh.login
/etc/profile.d contains application specific Macro variables, eg.
java.sh java.csh
Linux Firewall
===============
/sbin/iptables --- executable
/etc/init.d/iptables --- script file
/etc/sysconfig/iptables --- configuration file
To stop firewall
/etc/init.d/iptables stop
To start firewall
/etc/init.d/iptables start
To check firewall
/sbin/iptables -L --line-numbers --- to see various rules
/sbin/iptables -nL <INPUT/OUTPUT/RH-Firewall> --- to see various rules and ports
Chains
/sbin/iptables -F [chain] --- deleting all rules in the chain
/sbin/iptables -X [chain] --- delete optional user-def chain
/sbin/iptables -Z [chain] --- to reset / zero packet and byte counter
/sbin/iptables -N [chain] --- to create new user-def chain
<Target>: {Accept, Queue, Drop, Return}
Rule-specifications
/sbin/iptables -D chain rule --- delete rule in chain
/sbin/iptables -A chain rule --- append rule to end of chain
/sbin/iptables -I chain [rulenum] rule --- insert rule at rulenum or 1st position of chain
/sbin/iptables -R chain rulenum rule --- replace rule to at rulenum positin of chain
Rules specs: -<s/d> [!] IPaddr[/mask] [!] [port[:port]]
!=negate, []=optional
Other Parameters
-i --- interface, eg eth0
-p --- protocol, eg tcp, udp, icmp, all
-s --- source (see above)
-d --- destination (see above)
-j --- jump <target>
-m state --- match with the state module
--line-numbers
<James Sainsbury help>
Example - Webserver httpd - open at port 80
iptables -A INPUT -p tcp -i eth0 --dport 80 -j ACCEPT - Open HTTP port
or
So locate the rule in RH-Firewall-1-INPUT for WWW
eg iptables --line -L RH-Firewall-1-INPUT
eg 9 ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:http
and insert a rule just after that rule (9 here) so insert before rule 10
# iptables -I RH-Firewall-1-INPUT 10 -p tcp -s 0/0 -d enrol1 -dport 8080 -m state --state NEW -j ACCEPT
If you are satisfied the rule is working save the ruleset with
/etc/init.d/iptables save
If you trash the ruleset but haven't saved just
/etc/init.d/iptables restart
(You will notice there is a preceding rule
7 ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED
which accepts incoming packets as part of an ongoing connection.)
</James Sainsbury help>
To Save iptables:
/sbin/service iptables save
My Example for opening ports on local enrol1 to communicate with BIS on eb
iptables -I RH-Firewall-1-INPUT 12 -p tcp -s eb -d enrol1 --sport 49153:49155 -m state --state NEW -j ACCEPT
- To redirect all external traffic coming into port 80 into port 8080:
1. Use firewall configuration software to accept all TCP traffic from destination port 8080
2. (as root): iptables -t nat -A PREROUTING -p tcp --dport 80 -i eth0 -j REDIRECT --to-port 8080
- To redirect internal traffic coming into port 80 to port 8080
1. use program called rinetd.
2. in the /etc/rinetd.conf, have the line
"xx.xx.xx.xx 80 xx.xx.xx.xx 8080"
3. start the rinetd service
Linux Runlevels - Auto Startup
==============================
/usr/sbin/xxxd --- daemon or startup file to run
/etc/xxx.conf --- configuration file of the progran
/var/run/xxxd.pid --- process ID of the file that run
Runlevels
0 - system shutdown
1 - single user, no networking or daemons, for maintenance
2 - multiuser but no NFS, can access network but others cannot get in
3 - normal multiuser, no X login
4 - undefined
5 - multiuser X Display manager, X login from consoles
6 - reboot
s/S - single user mode
/etc/inittab - shows how the system is initialized
/etc/rc.d/init.d/ - list of daemon files, used or not used in startup
/etc/rc.d/rc3.d/ - links to actual files in init.d, at this particular runlevel 3
Program Sequence Config file
at Startup
===========================================
/sbin/init /etc/inittab
/etc/rc.d/init.d/* /etc/sysconfig/daemons
/etc/rc.d/rcN.d/*
(Sym links of init.d)
Note the files in rcN.d are preceded by K or S and a number
K = kill job
S = start job
number = order of pograms
To check what services are running:
/sbin/chkconfig --list
service <service name> status
- Remember to check Linux Firewalls if network services do not start
- edit script files in /etc/init.d/ to contain
1. start/stop functions (and others functions restart)
2. commands in start function MUST NOT BE EXECUTABLE commands. They
should be executable scripts.
3. use daemon function to run as non-root user
4. DO NOT USE & (forking) at the end of command
5. Example:
daemon --user=<userName> <ShellScriptName> 1> job.log 2>&1
- To register
/sbin/chkconfig --level 5 initdScript on
Block web-ads
==============
Most of us can't stand getting pummeled by ads everytime we go to a website. There
are a couple of ways to stop these ads. One is to edit your /etc/hosts file and list
your loopback address, 127.0.0.1, with the ad hosting address. For example, to
block ads from ar.atwola.com open up /etc/hosts as root:
[root@blueox]# vim /etc/hosts
Next, add the following line:
127.0.0.1 ar.atwola.com
This will line will tell your browser to look on 127.0.0.1 for the ads from
ar.atwola.com. Since localhost doesn't have the ad, it won't be displated.
You can download a hosts file that has a multitude of ads to block.
Personally I DO NOT RECOMMEND THIS approach. I have noticed that
my computer slows down trying to search through the hosts file.
The RECOMMENDED approach for firefox and mozilla users is to go to
the chrome directory of your personal browser folder.
[lumbrjackedpcj@blueox]# cd ~/.mozilla/firefox/dxshwykf.default/chrome
The directory dxshwykf.default will be different for you installation.
1. First, create the file userContent.css
[lumbrjackedpcj@blueox]# touch userContent.css
2. Next, edit this file with your favorite text editor
[lumbrjackedpcj@blueox]# vim userContent.css
3. Now, following the instrunctions on http://www.mozilla.org/support/firefox/adblock,
copy this userContent.css and paste it in yours.
SELinux Configuration
======================
Intro
The latest releases of Linux distributions incorporate the mandatory access
control architecture SELinux of the NSA into the Linux kernels.
More information on SELinux can be found at the following link:
http://www.nsa.gov/selinux/info/faq.cfm
/usr/sbin/sestatus -v --- to get status of selinux
En/Disabling SELinux:
1 - in your /etc/sysconfig/selinux file, set:
SELINUX=<disabled|enforcing|permissive>
... this is not TRUE disable, it only switches off enforcing and skips loading policy.
2 - if using GRUB as bootloader
go to /boot/grub/grub.conf or menu.lst (symlink to grub.conf)
add this line to the correct boot section
selinux=<0|1>
3 - if using LILO (... to be completed ..., contributions welcome)
Error Message:
"error while loading shared libraries: /usr/lib/liba52.so.0: cannot restore segment prot after reloc: Permission denied"
then your SELinux configuration is preventing your application from launching.
Solutions:
For shared libraries failing to load, try:
chcon -t texrel_shlib_t /usr/lib/liba52.so.0
For some other libraries
chcon -c -v -R -u system_u -r object_r -t lib_t $InstallationDirectory/SystemFiles/Libraries/
# View file/directory contexts: ls -Z /var/www
# Set file/directory context:
chcon -R -h -t httpd_sys_content_t /home/user1/public_html
Options:
* -R: Recursive. Files and directories in current directory and all subdirectories.
* -h: Affect symbolic links.
* -t: Specify type of security context.
# View security context of running processes: ps -eZ
# View security context of a user: id
uid=500(user1) gid=500(user1) groups=500(user1) context=user_u:system_r:unconfined_t
# Security contexts for a new file default to that of the parent directory.
For using SELinux with cgi-bin and files in Symbolic links
============================================================
Example:
perl script: /var/www/cgi-bin/overview
Data file: /home/BIS/BIS/log/ANGLE.txt
Symbolic Link: /var/www/html/BIS/ANGLE.txt
The security context, especially the subdirectories that contain it are
as follows:
/var/www/cgi-bin/overview root:object_r:httpd_sys_script_exec_t
/var/www/html/BIS user_u:object_r:httpd_sys_content_t
/var/www/html/BIS/ANGLE.txt user_u:object_r:httpd_sys_content_t
/home system_u:object_r:home_root_t
/home/BIS user_u:object_r:user_home_dir_t
/home/BIS/BIS user_u:object_r:httpd_sys_content_t
/home/BIS/BIS/log user_u:object_r:httpd_sys_content_t
/home/BIS/BIS/log/ANGLE.txt user_u:object_r:httpd_sys_content_t
Note that it is enough to ensure the "type" context at the /home/BIS/BIS
level and its contents seem to inherit the type.
RPM
===
Redhat package manager
Query
rpm -q -a --- list all packages
rpm -q -p <package_path.rpm> --- list packages inside package_path rpm file
rpm -q -l -p <package_path.rpm> --- list all files inside package_path rpm file
rpm -q -a -i --- list info on all packages
rpm -q -a -f --- list files of all packages
Installing Source RPMs
=======================
1. Get source code in rpm form from the "SRPMS" or "SOURCE" directories
in common repositories
http://download.fedora.redhat.com/pub/fedora/linux/core/5/source/SRPMS/
2. As root, unpack the source RPMs
rpm -ivh example.src.rpm
.... this will put the patches in /usr/src/redhat/SOURCES. But these
are not the real working file.
.... there is also a build *.spec file put in /usr/src/redhat/SPECS
3. Build the source code of the packages using:
command: rpmbuild -bp --target=<arch> kernel.spec
# cd /usr/src/redhat/SPECS/
# rpmbuild -bp --target=i686 kernel-2.6.spec
.... the source may appear in /usr/src/redhat/BUILD
4. Modify and recompile source
5. Package into rpm using:
Linux Commands
===============
XConfuration:
/usr/sbin/mouseconfig ------- mouse
chsh ------------------------- to change default shell
grep chee /etc/passwd -------- tells you what shell you are from
Xconfigurator ---------------- to use XFree86
/etc/X11/XF86Config ---------- configuration file
ls /usr/lib/X11/rgb.txt ------ to see a list of X11 defined color names
/usr/sbin/timeconfig --------- change time
/usr/sbin/ntsysv ------------- choose program to
/sbin/chkconfig -------------- boot during startup
/usr/doc/HOWTO --------------- HowTo help files
Commands:
useradd **** ----------------- add user with name ****
passwd ***** ----------------- make password for user
pwd -------------------------- print working directory
su **** ---------------------- to change to another user (from superuser)
su --------------------------- superuser
/sbin/shutdown -r now -------- shutdown and reboot now
startx ----------------------- to start X windows
ps -e/f/a/l/u ---------------- process
mount /dev/fd0 /tmp ---------- mount 1st floppy drive to /tmp
umount /tmp ------------------ umount any device from /tmp
cat notes >> notes2 ---------- prints the contents of a file to standard output
and then redirects and APPEND to file "notes2"
head [-count] ***** list first few lines of file
tail [-/+ [startline] [bcl]] *** list the last few lines
wc **** ---------------------- count the number of words, lines etc in a file
alias l='ls -al' ------------- make an alias, eg. to list files
ln **file1** **file2** ------- link file2 to original file 1 (hard link)
ln -s **file1** **file2**----- do symbolic link
ls -i ------------------------ list file and look and their i-node to see linked files
rsync -auv src dest ---------- copy / update between local/remote src/dest
size ------------------------- to calculate the size of the program
wc -l/c/w **file** ----------- Word Count lines, characters, words in a file
tr --------------------------- Translate, delete or replace text in a file.
tar cf - . | zip name - ***** Tar and then zip
unzip -p backup | tar xf - *** reverse of above - ie. unzip and untar
script <filename> ------------ starts a shell - records the whole sesion - exit the shell to stop script
Help Commands:
man -k **** ------------------ to search for keywords in manual entries
man # intro ------------------ #={1..8}, 8 intro topics of Unix manual.
man -M /opt/lam/man mpirun --- manual for "mpirun" program installed in /opt/lam
rpm -qdf /usr/bin/rtin ------- query
zless /usr/doc/HOWTO/...gz --- read compressed howto files without uncompressing
locate -d <dbfile> <filename>- find a command or file called filename
updatedb --localpath=<dirPath> --output=<dbfile>
to update data file before using locate
dirPath=path of file system to put in db
dbfile=database file to log info to.
slocate ...... replaces updatedb and locate commands
/etc/cron.daily/update ------- create database before using locate
find / -name .... ------------ find the file named ....
find ~ -size 0 -print -------- find file with zero size
crontab -e ------------------- to edit the crontab for current user
Build / Compile Programs
./configure --prefix=/home/che sets the installation to your local dir so no need root permissions
Environment:
set -------------------------- to list enviroment variables
xport AUTO=VW ---------------- set the variable auto=vw
alias ls='ls -color=auto' ---- colorize ls .. put this in .profile or .bashrc
./a.out ---------------------- to execute the file "a.out"
. <command/script> ----------- execute command or script in CURRENT environment
Process Control:
kill -9 PID ----------------- sure kill the process
nohup **** ------------------ No hang-up for process ****
nice -# **** ---------------- reduce priority of command
nice --# **** --------------- increase priority of command
sleep **** ------------------ timer - stalls for certain time
at -l/r/m time date **** ---- timer - run command AT specified TIME
Compilers
cc -------------------------- SGI MIPS c compiler
CC -------------------------- SGI MIPS c++ compiler
gcc ------------------------- gnu c compiler
g++ ------------------------- gnu c++ compiler
insight --------------------- online help manual on SGI MIPS C/C++ compilers
Gnuplot
load "plotfile" ------------- runs the script file "plotfile"
set term postscript color
set output "file.ps" -------
Emacs
<shift> <left-mouse> -------- change font size
.Xdefaults ------------------ change emacs properties
M-x replace-strings --------- to replace all occurances of strings
C-q C-m --------------------- produces the letters "^M"
M-x replace-strings, C-q C-m dos2unix file conversion within emacs
C-c C-w --------------------- Fortran End of Column marker
C-c C-r --------------------- Fortran ruler
M-C-j ----------------------- Fortran continuation line
Browsers:
netscape -install ----------- make colour look better
Graphics
gif2ppm --------------------- convert gif to portable pixmap format
pnmtogif -------------------- convert any pixmap to gif
pstoimg -transparent -------- make background transparent
Secure Operations
scp newfile chee@parallel.hpc.unsw.edu.au:~/programs
Networking
nslookup www.google.com ----- returns the IP address of the URL
ping ------------------------
ifconfig <ethernet card> ---- configuration of adapter
netstat -in -----------------
echo 1 >/proc/sys/net/ipv4/ip_forward -- allow routing
echo 1 >/proc/sys/net/ipv4/conf/all/rp_filter -- prevent IP spoofing
System Admin / Setup
/etc/passwd ----------------- user details
/etc/sys_id ----------------- name of the system
/etc/hosts ------------------ list of IP address and Names
~/.rhosts ------------------- list of computers and usernames for rsh without password
background ------------------ customizing background of screen
ls -l /lib/libc-* ---------- check the glibc version
uname -r ------------------ check the kernel version
/etc/rc.d/rc.local ---------- startup programs are included here
System /Hardware Information
lsmod ----------------------- list loaded modules
lsdev ----------------------- display info about installed hardware
modinfo --------------------- information about installed modules
free ------------------------ display free/used memory on the system
more /proc/interrupts ------- view IRQ channels
more /proc/dma -------------- view dma channels
more /proc/ioports ---------- view I/O memory addresses
Change Resources
ulimit -a ------------------- check user limit
ulimit -aH ------------------ check user HARD-limit
ulimit -s 32678 ------------- set SOFT & HARD stack limit
ulimit -sS 32678 ------------ set SOFT stack limit
echo 8192 >/proc/sys/fs/file-max --- increase number of files openable
echo 16384 >/proc/sys/fs/inode-max --- increase number of files openable
File Commands
file ***** ------------------- prints the type of the file ****
find -c ---------------------- find file which has changed
stat -t ---------------------- provides info about file
diff file1 file2 ------------- output differences between files
Remote X Gui Interface
startX ----------------------- starts the X server
startkde --------------------- starts the KDE desktop
Modules
insmod ----------------------- insert module quick UNSAFE way
modprobe --------------------- insert module SAFE way
Labels:
32bit,
64bit,
Bash,
Dual booting,
firewall,
fstab,
iptables,
Lilo,
linux,
Loading Profiles,
RPMs,
runleves,
SELinux configuration,
SSH Tunnelling,
USB Mounting,
X11 Port Forwarding
Notes R
NotesR
=======
This is for Windows mainly. However, many of the notes may apply to Linux as well.
Contents
=========
Setup / Install
R Upgrade new version
R - changs in 2.15.0
Running R
Quick Commands
R.matlab
R object types
Arrays
Packages - Manual Install
Packages
R - Stat functions
R - Time Series functions
Exploratory Data Analysis
Plotting on multiple windows
Writing to HTML files
List
Sets Operations
R - functions
3D plots
Plotting features
ECDF ecdf Empirical CDF
data.frames
Reading Data Files
serialIndepTest
[R] Operators
RUnit - Testing
Linear Least Squares Regression
R Interoperability
Calling Fortran from R
Fitting Distributions using VGAM::vglm and others
Error Handling
Sweave
Global variables and scope
Setup / Install
================
Mirror Site: http://cran.ms.unimelb.edu.au/
Installed in directory:
C:\Program Files\R\R-2.12.1\bin\i386
To install additional packages, use the GUI to download and install.
Help files:
http://127.0.0.1:21879/library/utils/html/help.html
C:\Program Files\R\R-2.12.1\doc\html\index.html
Appendix A in manual is a helpful starter.
Path - typical installation path is: C:\Program Files\R\R-2.12.1\bin
Include the path in the PATH variable or another envrionment variable
R Upgrade new version
======================
The R installation is setup such that multiple versions can exist on Windows.
For example it is possible to have installations at:
C:\program files\R\R-2.12.1
C:\program files\R\R-2.12.2
To perform the upgrade, suppose R-2.12.1 already exist and you wanted the new version R-2.12.2. The steps are:
1. Download R-2.12.2-win.exe
2. Close any R applications that is running.
3. Install by running R-2.12.2-win.exe
4. In Windows Explorer, go to C:\program files\R\R-2.12.1\library.
Highlight all folders and COPY (Ctrl-C)
5. Go to the new installation path C:\program files\R\R-2.12.2\library.
Paste (Ctrl-V) BUT DO NOT OVERRIDE existing folders.
This action will copy the libraries YOU installed in the old version to the new version BUT does not delete the base libraries that exist already in the new installation.
6. Run R-2.12.2, go to Packages -> Update packages...
7. If you wish, you can try uninstalling the older version. Please note that I have not done this so I cannot say if this step will affect the new version or not.
R - changes in 2.15.0
======================
Some changes to R functions which may are incompatible with previous versions are:
old: fisk(link.a = "loge", link.scale = "loge", earg.a=list(), earg.scale=list(), init.a = NULL, init.scale = NULL, zero = NULL)
new: fisk(lshape1.a = "loge", lscale = "loge", eshape1.a = list(), escale = list(), ishape1.a = NULL, iscale = NULL, zero = NULL)
old: VGAM::psinmad( a=..., q.arg=...) changed to
new: VGAM::psinmad( shape1.a=..., shape3.q=...)
old: VGAM::qfisk( a=....)
new: VGAM::qfisk( shape1.a=...)
Ref: http://www.r-project.org/
C-LEVEL FACILITIES:
o Passing R objects other than atomic vectors, functions, lists and
environments to .C() is now deprecated and will give a warning.
Most cases (especially NULL) are actually coding errors. NULL
will be disallowed in future.
.C() now passes a pairlist as a SEXP to the compiled code. This
is as was documented, but pairlists were in reality handled
differently as a legacy from the early days of R.
o call_R and call_S are deprecated. They still exist in the
headers and as entry points, but are no longer documented and
should not be used for new code.
Running R
==========
Two ways to use the R software
1. Rgui.exe - this is the windowing environment.
2. Rterm.exe - this is a command terminal environment.
- run by typing in any cmd terminal:
C:\Program Files\R\R-2.12.1\bin\i386
3. Running in batch mode:
Rterm.exe --no-restore --no-save < infile > outfile 2>&1
Alternatively, creaate a file called R.bat with the following content:
"c:\Program Files\r\R-2.13.1\bin\x64\Rterm.exe" --vanilla
To run,
R < test.R > test.out 2>&1
Or,
start cmd /K "R.bat < aaa.R > aaa.out 2>&1" - new CMD window remains open
start cmd /C "R.bat < aaa.R > aaa.out 2>&1" - new CMD window closes
4. Running scripts from Windows CMD terminal.
- set windows environment variable to installation path, eg:
Rbin=C:\Program Files\R\R-2.12.1\bin
- go to the dir with R file eg file called Initialize.R
- Run: "%Rbin%\Rscript" Initialize.R
- OR: START "runMe" cmd /K "%rbin%\Rscript" runMe.R
5. Running scripts from inside an R console.
- source("Initialize.R", print.eval = TRUE)
the print.eval = TRUE ensures that evaluation in script file is printed to screen
6. Running Parallel from Windows CMD terminal
- START "someName" "%rbin%\Rscript" someScript1.R
- START "someName" "%rbin%\Rscript" someScript2.R
the START is a windows command that forks open a separtate Windows terminal
Quick Commands
===============
license(), licence() - for distribution details.
contributors() - list of contributors
citation() - on how to cite R or R packages in publications.
demo() - for some demos
help() - for on-line help
help.start() - for an HTML browser interface to help.
help(par), ?par - help for function par
??par - multiple lists of help for par
q() - to quit R.
<command> - typing just the command line reveals the code
<command()> - executes the function.
ls() - list variables in the workspace
objects() - list of objects in the workspace
rm(a, b) - removes object a and b in the workspace
rm(list=ls()) - WARNING: removes all variables
rm(a, envir = globalenv()) - removes object from the Global Environment
search() - list the loaded packages
getAnywhere(x) - find which packages x belong too, good when multiple packages have same function name.
# - comments symbol
list.files() - list the files in the current directory
getwd() - get the path of the current directory\
setwd("c:\\data") - sets the working directory to c:\data
print(<r commands>) - prints output to screen, useful for R scripts.
paste("aa",xx,sep="") - concatenate strings; note xx may be a number
match(), pmatch(), grep - string matching
y<-unlist(strsplit(x, "_")) - parse or split the string x, using delimiter as "_". Access each item by y[0], y[1], etc.
which(data==const) - mask. Returns the indices where condition is true.
which(data==const, arr.ind=TRUE) Returns indices properly according to dimensions. If arr.ind=FALSE (default),
then even for multi dim array, it returns the index cumulated in ONE dimension only.
which.max - get the index of the maximum value of a vector.
max.col - get the index of the maximum value of a matrix (have not tried)
options(digits=xxx) - change precision display, where xxx is the number of digits required.
format(aaa, digits=9) - change precision display
sprintf("%.14f",aaa) - change precision display
calibrate::textxy - special text formatting (also see text in core package)
source("filename.R") - Batch: runs commands in file filename.R
sink("filename.R") - Batch: outputs to file filename.R, MUST USE PRINT COMMAND to print to file using SINK.
sink() - return output to terminal
write(t(x), file="a.txt", ncolumns=31, append=TRUE, sep="\t")
- write 2D matrix x (need transpose t), to file called a.txt, display in 31 columns, with tab separation and append to existing file.
- write to console by putting argument file="".
save(x, file=' ') - to save a variable to file, in binary format. See sink() command to save file in text format.
load(file=' ') - to load data, in binary format. Note: Variables are automatically loaded and overwrite existing variables.
file.exists('filepath') - checks if a file exists and returns TRUE/FALSE.
memory.size() - set/get the memory size in MB
gc() - garbage collection
a %in% b - test to see if element a is in vector b.
compare(a,b) - Package: compare. Compares objects a and b to see if they are the same.
runif(n, min=0, max=1) - Random Uniform Distribution - draw n points randomly.
set.seed(x) - Sets the seed for random generators.
table(xWords) - split data into frequency bins
c(rbind(vecA, vecB)) - interleave 2 vectors of data
file.exists('filename') - checks if the file exists
Sys.Date() - returns system date in format 'yyyy-mm-dd'
Sys.time() - returns date and time in format 'yyyy-mm-dd hh:mm:ss: EST'
system.time([blah]) - time it takes to execute blah.
process.time() - call this, run R statement, then call again and take difference between two times.
gc.time() - garbage collection time.
str(xxx) - information about the components of xxx
summary(xxx) - information summary about xxx
names(xxx) - information about the names of the components of xxx.
t(x) - Transpose of matrix x.
m1 <- matrix(1:12, nc=3) - create a matrix with 3 columns using the 12 data points
Data are distributed columnwise by default.
m1 <- m1[-4,] - remove the 4th row from a matrix.
m2 <- rbind(m1, c(1,2,3)) - add items 1,2,3 as a new row of matrix m1.
1:5 - like linspace, generates {1,2,3,4,5}
seq(a,b, by=c) - like linspace, generates sequence from a to b with step c.
rep(x, [times|each]) - replicates vector x with various options.
a %% b - the remainder of the number of times a is divisible by b
a %/% b - the modulus or the number of times a is divisible by b
dyn.load("a.dll", local=FALSE) - load the dll into R. The local=FALSE puts it in Global space rather than local space.
dyn.unload() - unload the dll from R, required if the dll needs to be recompiled.
is.loaded() - check whether a subroutine name has been loaded into R.
R.version - to list the system information R is run on.
sapply(X, FUN, ...) - a user friendly version of lapply. It applies a function to the list of objects.
- if X is a list of collection, eg x <- list(a = 1:10, beta = exp(-3:3), logic = c(TRUE,FALSE,FALSE,TRUE)) - then sapply(x, mean) will return 3 means, one for each group.
- if X is a vector of elements, then the function is applied to each element.
http://www.ats.ucla.edu/stat/r/library/advanced_function_r.htm
as.vector - vectors have NO dimension. Can us this to make num[1:14(1d)] into num[1:14]
months(aDate) - return Month in characters, abbreviate=TRUE to get short version.
as.Date(aDate, format) - translate in R Date object for input with specified format.
for(ii in 1:10){} - for loop control structure
for(ii in c(1:10){} - for loop control structure - using a numbers from a list
next - to skip to next iteration in for loop.
break - to break out of a for loop.
is.null(arg) - tests if the argument is NULL
is.nan(arg) - tests if the argument is NaN
is.finite(arg) - tests if the argument is finite
is.infinite(arg) - tests if the argument is infinite
.Machine$double.eps - gives machine EPSILON value
exists('aVar') - tests if the variable called aVar is defined or not.
knots(ecdf(X)) - retrieves distinct values of ecdf
setdiff(A,B) - data in A, not in B
R.matlab
=========
it appears the R package is not compatible with all versions of .mat file
you must save file as v6: save('D:\DATA\QARiskApp\mloss-v6.mat','mLoss','mLogLoss','-v6')
readMat(file.path("<mat file>")) reads the contents of "mat file"
bs <- readMat(file.path("<mat file>")) reads the contents of "mat file" and store in a structure "bs"
If the data stored in BS are made of many different types of data, eg integer, vectors, and come in different shapes, eg scalar or vectors, then the relevant commands are:
ll - info about the general structures available
bs[[1]] - first item of the structure. This can be scalar, 1D, 2D, etc matrices.
bs$<var1> - same as above, accesses the first component of the structure.
R object types
===============
Ref: http://cran.r-project.org/doc/contrib/R_language.pdf
http://cran.r-project.org/doc/manuals/R-lang.html#Basic-types
The types of objects are:
vectors, arrays, factors, lists, data frames, functions
Intrinsic attributes of objectts are:
mode(object)
length(object)
To change the type of an object, use:
as.characters(object)
as.integers(object)
as.<etc..>(object)
Objects can change size dynamically, or
by using length: length(object) <-3 # sets the array to new size 3
by resizing itself: object <- object[2* 1:5]
Objects have the following attributes: class, comment, dim, dimnames, names, row.names and tsp
attributes(object) lists all the attributes
attr(object, "dim") get or set the attribute "dim"
Arrays
=======
Any vector can be made into array if its "dim" attribute is defined. This gives the object a multi-dimensional shape.
The various ways of defining arrays are:
Z <- array(h, dim=c(3,4,2))
Z <- h ; dim(Z) <- c(3,4,2)
scal = "blah" ; zz <- array(scal, dim=c(3)); # resize into zz = { blah, blah, blah }
Z <- array(0, dim=0) # to create an array of zero length, it is of type numeric(0)
Outer product can be defined as:
ab <- a %o% b
ab <- outer(a, b, "*")
f <- function(x, y) cos(y)/(1 + x^2); z <- outer(x, y, f)
tapply() function - provides masking functionality.
Example:
state <- c("tas", "sa", "qld", "nsw", "nsw", "nt", "wa", "wa",
"qld", "vic", "nsw", "vic", "qld", "qld", "sa", "tas",
"sa", "nt", "wa", "vic", "qld", "nsw", "nsw", "wa",
"sa", "act", "nsw", "vic", "vic", "act")
statef <- factor(state)
> statef
[1] tas sa qld nsw nsw nt wa wa qld vic nsw vic qld qld sa
[16] tas sa nt wa vic qld nsw nsw wa sa act nsw vic vic act
Levels: act nsw nt qld sa tas vic wa
levels(statef)
[1] "act" "nsw" "nt" "qld" "sa" "tas" "vic" "wa"
incomes <- c(60, 49, 40, 61, 64, 60, 59, 54, 62, 69, 70, 42, 56,
61, 61, 61, 58, 51, 48, 65, 49, 49, 41, 48, 52, 46,
59, 46, 58, 43)
incmeans <- tapply(incomes, statef, mean)
giving a means vector with the components labelled by the levels (using function mean())
act nsw nt qld sa tas vic wa
44.500 57.333 55.500 53.600 55.000 60.500 56.000 52.250
stderr <- function(x) sqrt(var(x)/length(x)) # user defined function
incster <- tapply(incomes, statef, stderr)
> incster
act nsw nt qld sa tas vic wa
1.5 4.3102 4.5 4.1061 2.7386 0.5 5.244 2.6575
Subarray - let A = array of size [3,3]
A[,2] = second column of A
A[2,] = second row of A
Packages - Manual Install
==========================
1. Download packages from: http://cran.ms.unimelb.edu.au/
2. In R, select Packages -> Install Packages from Local Zip file
3. Choose the zip file from the location where it was downloaded.
Packages
=========
sessionInfo() - tells the packages loaded.
library() to see which packages are installed
library(boot) to load the package called boot - gives error if package not found
require(boot) to load the package called boot - gives warning if package not found
library(help = evir) to see the list of functions in the package
search() to see which package are loaded
loadedNamespaces() to see the packages loaded but not on the search list.
<lib>::<func>() to specify overloaded function using its library name to avoid confusion, eg base::Version()
QQ Plots {stats} : qqnorm, qqline, qqplot
importFrom(foo, f, g) loads only the functions f,g from packages foo
import(foo, bar) loads all functions which are exported from the packages foo and bar.
detach("package:R.matlab", unload=TRUE) remove from search() path a data.frame that has been attached or a package that was attached by library.
unloadNamespace("ns") unloads the namespace ns.
unloadNamespace("R.oo")
R.oo::Class()
detach("package:R.methodsS3", unload=TRUE)
removes R.methodsS3 from search() and loadedNamespaces()
Calling a function with package notation, eg copula::genFun
will bring package copula back into loadedNamespaces() but not in search().
unloadNamespace("copula") - will remove it again from loadedNamespaces()
library(copula) - adds copula to search() and loadedNamespaces()
unloadNamespace("copula") - removes it from search() AND loadedNamespaces()
genFun() - calling has no effect
copula::genFun - brings copula into loadedNamespaces()
library(copula) - adds copula to search() and loadedNamespaces()
detach("package:copula") - removes it from search() but NOT REMOVEd from loadedNamespaces()
genFun() - calling has no effect
copula::genFun - is OK
To use packages with Namespaces
- no need to use library() or any other way to load functions
- in the code, simply call [package]::[function] # note the double colon
- when finished, call: unloadNamespace("[package]")
- using this method loads the package in and out of the Namespaces. It has not been added in the search path.
To use packages with NO Namespaces
- call the library by: library([package]). This will load the package into the search path and add to the Namespaces. But since the package has no namespace, nothing is added to the Namespaces.
- in the code, just call the function using its name: [function](arguments)
- when finished, call : detach("[package]:[function]") # note the single colon. This will remove it from the search path. Since it has no Namespace, there is no point in calling: unloadNamespace().
Searching rules with Namespaces
It searches in package namespace first, then among the functions imported by that package, then R base, and lastly in the “normal” search path (as from ‘search()’). See “Writing R Extensions”, last paragraph before the end of sub-section. Consequently, the correct version of ‘set.vertex.attribute’ is used.
Namespace REF: http://www.r-bloggers.com/namespaces-and-name-conflicts/
R - Stat functions
===================
runif - Uniform random generator
rt - Student-T random generator
rexp - Exponential random generator
rweibull - Weibull random generator
rgamma - Gamma random generator
rpareto - Pareto random generator http://www.commanster.eu/rcode.html, http://hosho.ees.hokudai.ac.jp/~kubo/Rdoc/library/rmutil/html/Pareto.html
R - Time Series functions
==========================
This section is taken from http://cran.r-project.org/doc/contrib/Ricci-refcard-ts.pdf
written by Vito Ricci
INPUT
cycle(): gives the positions in the cycle of each observation (stats)
deltat(): returns the time interval between observations (stats)
end(): extracts and encodes the times the last observation were taken (stats)
frequency(): returns the number of samples per unit time (stats)
read.ts(): reads a time series file (tseries)
start(): extracts and encodes the times the first observation were taken (stats)
time(): creates the vector of times at which a time series was sampled (stats)
ts(): creates time-series objects (stats)
window(): is a generic function which extracts the subset of the object 'x' observed between the times 'start' and 'end'. If a frequency is specified, the series is then re-sampled at the new frequency (stats)
TS DECOMPOSITION
decompose(): decomposes a time series into seasonal, trend and irregular components using moving averages. Deals with additive or multiplicative seasonal component (stats)
filter(): linear filtering on a time series (stats)
HoltWinters(): computes Holt-Winters Filtering of a given time series (stats)
sfilter(): removes seasonal fluctuation using a simple moving average (ast)
spectrum(): estimates the spectral density of a time series (stats)
stl(): decomposes a time series into seasonal, trend and irregular components using 'loess' (stats)
tsr(): decomposes a time series into trend, seasonal and irregular. Deals with additive and multiplicative components (ast)
TESTS
adf.test(): computes the Augmented Dickey-Fuller test for the null that 'x' has a unit root (tseries)
Box.test(): computes the Box-Pierce or Ljung-Box test statistic for examining the null hypothesis of independence in a given time series (stats)
bds.test(): computes and prints the BDS test statistic for the null that 'x' is a series of i.i.d. random variables (tseries)
bptest(): performs the Breusch-Pagan test for heteroskedasticity of residuals (lmtest)
dwtest(): performs the Durbin-Watson test for autocorrelation of residuals (lmtest)
jarque.bera.test(): Jarque-Bera test for normality (tseries)
kpss.test(): computes KPSS test for stationarity (tseries)
shapiro.test(): Shapiro-Wilk Normality Test (stats)
STOCHASTIC MODELS
ar(): fits an autoregressive time series model to the data, by default selecting the complexity by AIC (stats)
arima(): fits an ARIMA model to a univariate time series (stats)
arima.sim(): simulate from an ARIMA model (stats)
arma(): fits an ARMA model to a univariate time series by conditional least squares (tseries)
garch(): fits a Generalized Autoregressive Conditional Heteroscedastic GARCH(p, q) time series model to the data by computing the maximum-likelihood estimates of the conditionally normal model (tseries)
GRAPHICS
lag.plot: plots time series against lagged versions of themselves. Helps visualizing "auto-dependence" even
when auto-correlations vanish (stats)
monthplot(): plots a seasonal (or other) subseries of a time series (stats)
plot.ts(): plotting time-series objects (stats)
seaplot(): plotting seasonal sub-series or profile (ast)R functions for time series analysis by Vito Ricci (vito_ricci@yahoo.com) R.0.5 26/11/04
seqplot.ts(): plots a two time series on the same plot frame (tseries)
tsdiag(): a generic function to plot time-series diagnostics (stats)
ts.plot(): plots several time series on a common plot. Unlike 'plot.ts' the series can have a different time bases, but they should have the same frequency (stats)
MISCELLANEOUS
acf(), pacf(), ccf(): the function 'acf' computes (and by default plots) estimates of the autocovariance or autocorrelation function. Function 'pacf' is the function used for the partial autocorrelations. Function 'ccf' computes the cross-correlation or cross-covariance of two univariate series (stats)
diff.ts(): returns suitably lagged and iterated differences (stats)
lag(): computes a lagged version of a time series, shifting the time base back by a given number of observations (stats)
Exploratory Data Analysis
==========================
summary(data) Min, 1stQ, Median, Mean, 3rd Q, Max
fivenum(data) Returns Tukey's five number summary (minimum, lower-hinge, median, upper-hinge, maximum) for the input data.
stem(data) Stem and leaf plot
Plotting on multiple windows
=============================
# Create 3 plots
dev.new() # Or X11()
dev.1 <- as.integer(dev.cur())
dev.new()
dev.2 <- as.integer(dev.cur())
dev.new()
dev.3 <- as.integer(dev.cur())
x <- seq(1, 100, 0.1)
# Switch to device 1
dev.set(dev.1)
plot(x, sin(x), "l")
# Switch to device 3
dev.set(dev.3)
plot(x, cos(x), "l")
# Add something to graph #1
dev.set(dev.1)
points(x, cos(x), "l", col="red")
Writing to HTML files
=======================
This functionality of writing to HTML files are facilitated by either of the following packages:
i) HTMLutils - which depends on R2HTML
ii) hwriter
The functionalities are best described using examples.
The examples here are based on the hwriter package only.
<code>
pHand <- openPage(fileOut, title=title)
hwrite(title, page=pHand, center=TRUE, heading=1)
hwrite('',pHand, br=TRUE)
hwriteImage(fileOut_hist, pHand, br=TRUE)
hwrite('Histogram',pHand, br=TRUE)
closePage(pHand, splash=TRUE)
</code>
The html file is created using the openPage() function and finishes off using closePage() function.
pHand - The file handle. This is needed by closePage as well as other html functions to identify which file to write to.
hwrite() - the function used to write html text. Note there are many arguments in hwrite() to control HTML parameters like: p, br, heading, center, etc.
hwriteImage() - the function to write an image to the html file. In this example, the path of the image is stored in the string fileOut_hist.
List
=====
A list in R can be made up of different data types (numbers, strings) as well as different size objects.
An example is :
Lst <- list(name="Fred", wife="Mary", no.children=3, child.ages=c(4,7,9))
In the example above, the list has got 4 components which are: name, wife, no.children, child.ages
The components can be accessed as Lst$wife or Lst[[2]]
The fourth component is a vector and each elements {4,7,9} can be accessed as:
Lst[[4]][1], Lst[[4]][2], Lst[[4]][3] OR
Lst$child.ages[1], Lst$child.ages[2], Lst$child.ages[3] OR
Lst[["child.ages"]][1], Lst[["child.ages"]][2], Lst[["child.ages"]][3] OR
Creating list: list(name1=obj1, name2=obj2, ....)
Removing elements from a list by index : lll[-c(2,3)], remove second and third elements from the list.
Removing elements from a list by value : lll[lll != 'element']
Extracting elements from a list: lll[c(2,3)] extracting the second and third elements from the list.
Append an item (aaa) to an existing list(lll): append(lll, list(aaa))
Append an item to an 1D array of list
aa <- array(list(), dim=2) # create an array of list items, initially all items are NULL
aa[1] <- c(3.2) # so aa[1,2][[1]] = 3.2
aa[1] <- list(c(aa[1][[1]], 3.4)) # dereference the list, then concatenate, then make into a list again.
Append an item to an array of list
aa <- array(list(), dim=c(2,2)) # create an array of list items, initially all items are NULL
aa[1,2] <- c(3.2) # so aa[1,2][[1]] = 3.2
aa[1,2] <- list(c(aa[1,2][[1]], 3.4)) # dereference the list, then concatenate, then make into a list again.
To transform a list (ll) into just values, with no names:
ull <- unlist(ll)
names(ull) <- NULL
Dynamic reduction of list: This scans through a list, and at the same time removing specific elements of the list.
<code>
while(length(qList) != 0) {
... blah ...
# Update the qList by removing the certain elements, or rather choosing only specific elements to remain.
qList <- qList[qList != as.character(dataSet$xposId)]
... blah ...
} # end while
</code>
Sets Operations
================
is.element(elem, set) - test if element is in the set
union(x, y)
intersect(x, y)
setdiff(x, y)
setequal(x, y)
a & b - element wise comparison, say a is {TRUE, FALSE, TRUE}, b is {FALSE, FALSE, TRUE}, then answer is {FALSE, FALSE, TRUE}
R - functions
===============
The example below best illustrate the structure of a function in R. Here are the things to note.
- the name of the function in the example is myFunction
- the argument list (a,b) are INPUT arguments. They are placed next to the 'function' keyword.
- the output of the function is the most tricky part in R. Essentially, the output must be the last variable assignment in the function. So the last line has the variable 'out'.
- to produce multiple outputs, one way (as illustrated), is to create a list called 'out'. Then at the very last line, aggregate all your desired output variables into the list.
<code>
myFunction <- function(a, b){
out <- NULL # initialize output
c <- a + b
d <- a - b
# Final Result OUTPUT
out <- list(mySum=c, myDiff=d)
}
</code>
When calling the function, assign the function to a variable or handler,
result <- myFunction(2.3, 5.1)
To access the multiple output variables:
result$mySum
result$myDiff
3D plots
=========
The 3D plotting functions are:
image(x,y,z, ...) - heat plot
contour(x,y,z, ...) - contour plot
persp(x,y,z, ...) - wireframe plot
x = {1:rows}, y = {1:cols} are 1D vectors
z[1:rows, 1:cols] is the 2D matrix
3D Cloud plot example:
Data type must be
#str(data )
#'data.frame': 14880 obs. of 4 variables:
# $ x : num [1:14880(1d)] 1 2 3 4 5 6 7 8 9 10 ...
# $ y : num [1:14880(1d)] 1 1 1 1 1 1 1 1 1 1 ...
# $ z : num [1:14880(1d)] 1 1 1 1 1 1 1 1 1 1 ...
# $ val: Factor w/ 3 levels "-1","0","1": 1 1 1 1 1 1 1 2 2 2 ...
The plot function call is like:
<code>
thisPoints = Rows(trellis.par.get("superpose.symbol"), 1:3)
thisPoints$pch <- c(1,3,2)
thisPoints$col <- c("#0080ff", "#ff00ff", "#ffff00")
cloud(cRG ~ cBU * cRT, data = data , groups = val, screen = list(z = 20, x = -70),
perspective = FALSE, zoom = 1.0, pch = thisPoints$pch , col = thisPoints$col,
key = list(title = "Cluster", x = .15, y=.85, corner = c(1,-0.3),
border = TRUE,
points = thisPoints,
text = list(levels(data$val))))
</code>
Parameters like 'screen' are found under ?panel.cloud
show.settings() - will reveal the settings for "superpose.symbol"
?points - to see more options for points parameter
pch - controls the symbols for the points eg 1=circle,2=triangle,3=plus
col - color in Hex code
Plotting features
====================
All the parameters here appear as "plot( .... <param> ....)" unless otherwise specified:
plot(... xlim=c(-1, 10) ...) - specify the limit / boundary on the x-axis
text(x,y,labels,...) - specify text labels in a plot
lines(xvec, yvec) - draw line between 2 points
abline - draw line across whole graph using m,c from y=mx+c
plot(.... type='l' ...) - type: l=lines, p=points
plot(.... las= .....) - orientation of axis labels
plot(.... main='title' ....) - Title for the graph
To control the margin around the plot, specify the margin specification first:
par(oma=c(4,2,2,2))
barplot(.....)
Plotting to PNG files &
Plotting Empirical CDFs
# let vData be a vector of data point
aECDF <- ecdf(vData) # generate a ECDF object called aECDF
summary(aECDF) # summary stats for the ECDF
png("filename.png", bg="transparent")
plot(aECDF, main="title of graph", ylab="ylabel", xlab="xlabel")
dev.off()
# more examples
x10<-seq(1,10)
F10 <- ecdf(x10)
plot(F10)
plot(F10(x10), xlim=c(0,12))
plot(F10, xlim=c(0,12))
plot(x10, F10(x10), xlim=c(0,12), verticals=TRUE)
# another example
xData <- rweibull(100, shape=0.5)+3.0
fdata <- ecdf(xData) # note fdata is a ecdf function
plot(fdata, xlim=c(0,25)) # special plot takes in the ecdf function
plot(xData, fdata(xData), xlim=c(0,25)) # equivalent to using plot on the ecdf function
Adding points to existing plots:
- use the points() function, eg.
points(vecX, vecY, ....)
Plotting 2 or more data in one graph
plot(x), par(new=TRUE), plot(y)
ECDF ecdf Empirical CDF
==========================
aa <- c(1,2,3,4,5)
bb <- c(1,1,2,2,3,3,4,4,5,5)
plot(ecdf(aa))
par(new=TRUE)
plot(ecdf(bb))
### This results in identical ecdf
cc <- c(1,2,2,3,3,4,5)
plot(ecdf(aa))
par(new=TRUE)
plot(ecdf(cc))
### These two plots are different. The gaps are 1.0/7 ~ 0.148
### For the ecdf(cc), the x values are 1,2,3,4,5, the y values are: 0.14, 0.42, 0.71, 0.86, 1.0
dd <- c(2,3,3,2,5,1,4)
plot(ecdf(cc))
par(new=TRUE)
plot(ecdf(dd))
# Sorting has no effect on ecdf(), the data would be sorted by ecdf() anyway.
stepfun()
eg. myfun <- stepfun(1:10, cumsum(c(0, rep(0.1, 10))))
- first argument is x values, second argument starts at zero and is one element longer
- 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 x values
- 0.0, .1, .2, .3, .4, .5, .6, .7, .8, .9, 1.0 y values
stepfun() plot looks identical to ecdf plot, but with step lines drawn.
data.frames
=============
1. Creating data.frames from existing mixed type array.
r1 <- c(2,1, 0.44)
r2 <- c(23,11, 0.424)
dff <- data.frame(rbind(r1, r2))
colnames(dff) <- c("a", "b", "c")
rownames(dff) <- NULL removes the row names in a data frame
where the last row changes the default column names.
2. Creating data.frames from two vectors
age <- c(.....)
height <- c(.....)
village <- data.frame(age=age, height=height)
3. Accessing data.frames
village[1,2]
village["age", "height"]
4. Utility functions
nrow(village)
5. Read in data.frame from input file - see section on "Reading Data Files"
6. To hide, remove the first column which is row numbering, need to use
print(dataframe, row.names=FALSE)
Reading Data Files
====================
1. Reading data from text files into data frame.
Example: given the text data is in the following format:
RG BU RT cMap
1 1 1 -1
2 1 1 -1
3 1 1 -1
1 2 1 -1
2 2 1 -1
- say the data above is in the file called data.txt
- to read the data:
df <- read.table('data.txt', header=TRUE)
- the df structure would automatically be a data.frame.
serialIndepTest
================
1. d <- serialIndepTestSim(n=100, lag.max=3)
2. test <- serialIndepTest(x,d)
3. test
4. dependogram(test,print=TRUE)
Above are the steps to perform the serial independence tests using the serialIndepTest.
Step 1 creates the d object which is the result of simulation, which does not require any input data from the sample to be tested.
Step 2 performs the actual independence test.
Step 3 prints out the results. The Fisher result is better "tends to give the best results and was found to frequently outperform the test based on In".
Step 4. Plots and prints out the results.
Note that in Step 4, it tests all the subsets of lag of the original time series. The subsets are the Mobius decomposition. The original series is independent iff all the subsets are also independent. If any of the vertical line (test statistic) is below the dot(critical value), then that subset cannot reject the null hypothesis of independence. Only when the line is above the dot, then we reject the independence hypothesis for that subset.
Note (n - lag.max) >= 2
"Modeling Multivariate Distributions with Continuous Margins Using the Copula R package", Ivan Kojadinovic, Jun Yan, Journal of Statistical Software, May 2010, Vol 34, Issue 9.
[R] Operators
==============
- Minus, can be unary or binary
+ Plus, can be unary or binary
! Unary not
~ Tilde, used for model formulae, can be either unary or binary
? Help
: Sequence, binary (in model formulae: interaction)
* Multiplication, binary
/ Division, binary
^ Exponentiation, binary
%x% Special binary operators, x can be replaced by any valid name
%% Modulus, binary
%/% Integer divide, binary
%*% Matrix product, binary
%o% Outer product, binary
%x% Kronecker product, binary
%in% Matching operator, binary (in model formulae: nesting)
< Less than, binary
> Greater than, binary
== Equal to, binary
>= Greater than or equal to, binary
<= Less than or equal to, binary
& And, binary, vectorized
&& And, binary, not vectorized
| Or, binary, vectorized
|| Or, binary, not vectorized
<- Left assignment, binary
<<- Assigning to Global / R-Environment variables , see assignOps{base}
-> Right assignment, binary
$ List subset, binary
RUnit - Testing
================
RUnit is a unit testing package that allows each function to be tested automatically.
The following are instructions of how to set it up.
Ref: http://cran.r-project.org/web/packages/RUnit/vignettes/RUnit.pdf
1. Assume there is a function called c2f which needs to be tested. The c2f function can be like:
c2f <- function(c) return (9.5 * c + 32)
2. Create a file called "runitc2f.r" and write the test function as follows:
<code>
test.c2f <- function() {
checkEquals(c2f(0), 32)
checkEquals(c2f(10), 50)
checkException(c2f("xx"))
}
</code>
3. Create another file, with any name, as long as it is loaded using "source" command in R.
Then define the test suite as:
<code>
testsuite.c2f <- defineTestSuite("c2f",
dirs = file.path( <path to your test files> ),
testFileRegexp = "^runit.+\\.r",
testFuncRegexp = "^test.+",
rngKind = "Marsaglia-Multicarry",
rngNormalKind = "Kinderman-Ramage")
</code>
The name of this test suite file and the actual name of the defined testsuite does not matter much. But the most important names are the names of the test file which must be "runit..." and the actual test function names which must be "test....". These restrictions are defined in the definition of the test suite above.
4. If the test suite is defined in the file "testsuite.R", then to run the unit tests:
<code>
source('testsuite.R')
testResult <- runTestSuite(testsuite.c2f)
printTextProtocol(testResult)
</code>
5. To test Exception cases.
- in the actual function, there need to be a catch for the error and a call to the "stop" function
- in the runitXXXX.r file, the checkException is called with the function call as its argument, eg:
checkException( funcName(var1, var2, ...) )
Linear Least Squares Regression
================================
There is a few steps to perform the linear regression. The following example illustrates the various functions to facilitate regression analysis.
Example: Assume variable year, rate
cor(year, rate) - a first check for correlation
suppose we wish to fit the linear model as: rate = slope x year + intercept
fit <- lm(rate ~ year)
fit returns the coefficients for the linear equation such that
fit$coefficients[1] -> intercept
fit$coefficients[2] -> slope
To calculate the residuals:
res <- rate - slope x year + intercept
alternatively,
residuals(fit)
To plot the data and the regression line:
plot(year, rate)
abline(fit)
Finally, a whole lot of information can be obtained by:
summary(fit)
R Interoperability
===================
http://cran.r-project.org/doc/manuals/R-exts.html#Package-structure
http://www.revolutionanalytics.com/why-revolution-r/customer-testimonials.php
http://www.omegahat.org/RwxWidgets/ExampleDocuments/RC++Methods.pdf
using .Call - a souped up version of .C. It provides better ability to use R functions from within C code, and the C code can then be wrapped up to be used in R.
http://www.biostat.jhsph.edu/~bcaffo/statcomp/files/dotCall.pdf
Calling Fortran from R
=========================
http://www.stat.sc.edu/~grego/courses/stat740/handout.pcfortran.pdf
http://www.stat.umn.edu/~charlie/rc/
http://math.acadiau.ca/ACMMaC/howtos/Fortran_R.html
The steps to call Fortran subroutines from R is listed here using simple examples to illustrate the process.
A few notes before beginning:
- R can call Fortran Subroutines easily. Apparently it cannot call Fortran Functions directly. If a Fortran Function is needed, then it should be changed into a subroutine or create a wrapper function (this is a purely Fortran exercise and will not be detailed here).
- To call the Fortran from R, this example uses the .C() R function. Apparently the .Fortran() in R has some instabilities, so it will not be detailed in this example.
1. Create the Fortran subroutine, compile into a dll and expose the subroutines to be exported.
There are many ways and conventions of doing this. However, this example will only show the one that has been tested here and works. This example is based on Intel Fortran 11 on Windows. Also the naming convention to export the subroutine name is STDCALL (there are a few others, but this works here). The sample code is:
<code>
module modFtest
implicit none
contains
subroutine oneIntegerInput_dll(aaa)
!DEC$ ATTRIBUTES DLLEXPORT, STDCALL, ALIAS:'_oneIntegerInput_dll' :: oneIntegerInput_dll
!DEC$ ATTRIBUTES REFERENCE :: aaa
implicit none
integer(4), INTENT(IN) :: aaa
integer(4) :: bbb
bbb=aaa+1
end subroutine oneIntegerInput_dll
end module modFtest
</code>
A few things to be noted:
- The subroutine oneIntegerInput_dll is in a module called modFtest.
- The modFtest module is found in the CallFromR project. When compiled, the output would be a callfromr.dll.
- There can be multiple modules with multiple subroutines, constructed under this CallFromR project which would be accessible from its callfromr.dll
- The Intel Fortran (for Windows) way of exporting symbols is to use the DLLEXPORT attribute and the REFERENCE attribute to declare its variables.
- To be safe, just keep the Fortran variables to primitive data types. And declare all variables as REFERENCE.
- The accessibility of the Fortran variables can still be controlled via the INTENT statements.
2. Loading the DLL from R.
Within R, the dll can be loaded as:
dyn.load("callfromr.dll")
To unload, use:
dyn.unload("callfromr.dll")
If the dll is not in the current directory, the path may need to be provided. Ensure the path separation / or \ is correct for your system.
WARNING: If the dll is still loaded by R, then any attempt to recompile the Fortran code may fail because the dll is being used by R. The dll has to be unloaded before the dll can be recompiled by Fortran.
3. To use the subroutines from the dll, call as follows:
callfromr=.C("oneIntegerInput_dll", as.integer(3))
Note that the subroutine name is case sensitive. Although Fortran itself is not case-sensitive, the exporting of the name using DLLEXPORT has specified the name to be oneIntegerInput_dll with case-sensitivity.
The arguments of the subroutine need to be transformed from R, using as.integer() or something else.
4. Troubleshooting.
To check is a subroutine name has been loaded via the dll, type:
is.loaded("oneIntegerInput_dll")
5. Advanced arguments passing.
The above steps is the first steps to show Fortran - R compatibility. The section here will explore a combination of arguments that can be passed between Fortran and R with expanded examples.
WARNING:
a) In Fortran, DO NOT USE real(4). Instead use real(8)
b) For integers at least, when calling for R, need to cast the integer using as.integer(iii) in the .C(...) argument list.
c) Fortran array arguments must be declared like FCN(N), not FCN(1:*)
A. One integer in, one integer out
integer(4), INTENT(IN) :: aaa
integer(4), INTENT(OUT) :: bbb
The call can be made by:
callfromr=.C("twoIntInOut_dll", as.integer(3), as.integer(0))
The result is that callfromr will be a list of two integers.
B. Double Precision, one in, one out
REAL(8), INTENT(IN) :: aaa
REAL(8), INTENT(OUT) :: bbb
The call can be made by:
callfromr=.C("twoDblInOut_dll", as.double(3.1), as.double(0.0))
C. Vector of double precisions
integer(4), INTENT(IN) :: nn
REAL(8), INTENT(IN) :: aaa(nn)
REAL(8), INTENT(OUT) :: bbb(nn)
Setting up an R example
aaa <- seq(1,5)*1.2 # produces { 1.2 2.4 3.6 4.8 6.0 }
bbb<-seq.int(0, 0,length.out=5) # produces { 0,0,0,0,0 }
The call from R is:
callfromr=.C("vecInOut_dll", as.integer(length(aaa)), as.double(aaa), as.double(bbb))
The result will have 3 items, 1 integer, 1 input vector and 1 output vector.
#### Testing MleFitTruncLN
!DEC$ ATTRIBUTES REFERENCE :: K, L, X, w, parVal, parFlag, stderr, factor , cdf, sList
integer K !input, number of elements in the sample
double precision L !input, the known truncation level
double precision X(1:*) !input, sample data array
double precision w(1:*) !input array of weighting factors
double precision :: parVal(1:*) !output parameter values (mu and sigma)
integer parFlag(1:*) !input, option flag
double precision stderr(1:*) !output stderr(mu) and stderr(sigma)
double precision factor(1:*) !output truncation correction factor, for frequency fitting
double precision cdf(1:K) !output, cdf values at x points
integer :: sList ! error message structure
K <- 100
L <- 2.0
X <- abs(rnorm(K)) + 2*L # rlnorm(K, meanlog = 1.553 , sdlog = 0.117) #
w <- seq(1.0,1.0,length.out=K)
parVal <- c(0.0, 1.0) # mu, sigma - Need INITIAL VALUES
parFlag <- c(0, 0) # correspond to mu, sigma. 0 is the default action. 1 means to not calculate that parameter
stderr <- c(0.0, 0.0)
factor <- c(0.0, 0.0) # Truncation factor
cdf <- seq(0,0,length.out=K)
sList <- 0
callfromr=.C("MleFitTruncLN", as.integer(K), as.double(L), as.double(X), as.double(w), as.double(parVal), as.integer(parFlag), as.double(stderr), as.double(factor), as.double(cdf), as.integer(sList))
#Results
List of 10
$ : int 100
$ : num 2
$ : num [1:100] 4.44 4.24 4.52 4.37 4.97 ...
$ : num [1:100] 1 1 1 1 1 1 1 1 1 1 ...
$ : num [1:2] 1.553 0.117
$ : int [1:2] 0 0
$ : num [1:2] 0.01168 0.00826
$ : num [1:2] 1.00e+02 1.31e-21
$ : num [1:100] 0.297 0.178 0.349 0.25 0.664 ...
$ : int 0
#ECDF plot of original data
xecdf <- ecdf(X)
plot(xecdf)
# Compare with theoretical plot based on parameters found by MLE
xtest <- seq(0,8, length.out=150)
ptest <- plnorm(xtest, meanlog = 1.553 , sdlog = 0.117)
dev.new()
plot(xtest, ptest)
# Compare original and simulated curve by using adk.test
xtest <- rlnorm(150, meanlog = 1.553 , sdlog = 0.117)
#xtest <- rlnorm(150, meanlog = 1.549 , sdlog = 0.112)
adkOut <- adk.test(list(x1=X, x2=xtest))
Fitting Distributions using VGAM::vglm and others
==================================================
To fit a random distribution function, one can use the package VGAM with its vglm function.
An example is:
ylg <- rweibull(5000, 3.2, 1.9)
f3_dist <- VGAM::vglm(formula=ylg ~ 1, family=VGAM::weibull )
In the above example, we create a test random weibull distribution given the parameters (3.2, 1.9)
To use the VGAM::vglm function to find the parameters,
let the formula be Y~1
let the family be VGAM::weibull which is the distribution we want to fit.
Note the output coefficients returned in f3_dist are NOT the parameters.
The parameters are obtained by:
> VGAM::Coef(f3_dist)
shape scale
3.25349663371564 1.90127764909165
> VGAM::summaryvglm(f3_dist)
Call:
VGAM::vglm(formula = ylg ~ 1, family = VGAM::weibull)
Pearson Residuals:
Min 1Q Median 3Q Max
log(shape) -11.108862089551 -0.3482171960797 0.3379254287005 0.6850168947179 0.7955259659881
log(scale) -1.492863009759 -0.7246117432266 -0.2719877869469 0.4750868268133 7.0563972998482
Coefficients:
Value Std. Error t value
(Intercept):1 1.1797303052868 0.011026577908436 106.9897038849
(Intercept):2 0.6425261070716 0.004577189943693 140.3756704388
Number of linear predictors: 2
Names of linear predictors: log(shape), log(scale)
Dispersion Parameter for weibull family: 1
Log-likelihood: -4310.23959834649 on 9998 degrees of freedom
Number of Iterations: 2
Notice that predictors are log(shape), log(scale) and their corresponding values are 1.179, 0.642.
These two values are called the Intercepts because our model is ylg~1
Since they are log, the actual values should be exp(1.179) and exp(0.642) which becomes 3.253 and 1.901.
These last two are consistent with the values from VGAM::Coef which are the parameters for the example
weibull distribution.
Alternative:
library(MASS)
fitdistr(ylg, "weibull")
Fitting poisson with glm:
xSamp <- c(18 15 20 16 29 11 11)
bbb <- glm(xSamp~1, family=poisson())
> summary(bbb)
Call:
glm(formula = xSamp ~ 1, family = poisson())
Deviance Residuals:
Min 1Q Median 3Q Max
-1.588903781006 -1.058919519420 -0.279181657945 0.438723321313 2.603294064982
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.841581593727 0.091287051899 31.12798 < 2.22e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 12.67798701909 on 6 degrees of freedom
Residual deviance: 12.67798701909 on 6 degrees of freedom
AIC: 47.14811314083
Number of Fisher Scoring iterations: 4
To get the fitted Poisson parameter:
1. fitted.values(bbb)
1 2 3 4 5 6 7
17.1428571428641 17.1428571428641 17.1428571428641 17.1428571428641 17.1428571428641 17.1428571428641 17.1428571428641
2. Check with : mean(xSamp)
[1] 17.1428571428571
3. Get value of intercept from summary -> 2.84158
Convert this: exp(2.84158) = 17.1428
Error Handling
===============
Ref: http://mazamascience.com/WorkingWithData/?p=912
The general structure for tryCatch() has the following form:
result = tryCatch({
expr
}, warning = function(w) {
warning-handler-code
}, error = function(e) {
error-handler-code
}, finally {
cleanup-code
}
Alternative is the: try()
Errors:
LoadLibrary failure: %1 is not a valid Win32 application.
Sweave
========
http://www.statistik.lmu.de/~leisch/Sweave/Sweave-manual.pdf
http://www.r-bloggers.com/sweave-tutorial-1-using-sweave-r-and-make-to-generate-a-pdf-of-multiple-choice-questions/
http://www.r-bloggers.com/search/Sweave
http://www.cepe.ethz.ch/education/NPecoHS2010/Sartori-Sweave.pdf
Global variables and scope
===========================
R can handle variables with the same name defined in different scopes.
Example:
aaa <- 2.0 - defined in the interactive environment
go into function func() and print gives
aaa = 2.0
In function func(), define
aaa <- 3.0
Printing gives
aaa = 3.0
Still in func(), rm(aaa) then print(aaa) will give:
aaa=2.0
This behaviour is only valid when aaa is originally a global env variable and the aaa defined in the function is local.
In the function, if aaa is defined as global, eg. aaa <<- 3.0
then this will replace the global definition.
Subscribe to:
Posts (Atom)