## Introduction to Scientific Computing
### Lecture 07: Core modules in Python and Exercises
#### J.R. Gladden, Spring 2018, Univ. of Mississippi

There are **many** modules that come with every Python distribution, the so-called "core modules". These libraries provide tools for all kinds of tasks, such as working with files on the computer, establishing a network connection, and getting time information.

We'll explore the following modules:
- os
- sys
- time
- glob


The **os** module provides access to information about the local operating system.

In [1]:
import os

In [5]:
print(os.getcwd())
print(os.name)

/Users/joshgladden/Box Sync/ScientificComputing/Spring2018/lecs_code
posix
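Beyond reporting the working directory, **os** is often used to build and take apart file paths. The short sketch below is an illustration rather than part of the lecture code; the file name is only an example and does not need to exist on disk.

```python
import os

cwd = os.getcwd()

# Build a path in a platform-independent way (example file name only)
path = os.path.join(cwd, 'mydata.dat')

# Split the path into its pieces
root, ext = os.path.splitext(path)
print(os.path.basename(path))   # mydata.dat
print(ext)                      # .dat

# Check what kind of object a path points to
print(os.path.isdir(cwd))       # True

# List everything in the current directory (order is arbitrary)
entries = os.listdir(cwd)
print("%d entries in the current directory" % len(entries))
```

Using os.path.join instead of gluing strings together with '/' keeps the code portable between posix systems and Windows.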


The **sys** module has information about the Python interpreter and its environment.

In [6]:
import sys

In [9]:
print(sys.version)
print('\n')
print(sys.path)

2.7.13 |Enthought, Inc. (x86_64)| (default, Mar 2 2017, 08:20:50) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]


['', '/Users/joshgladden/Library/Enthought/Canopy/edm/envs/User/lib/python27.zip', '/Users/joshgladden/Library/Enthought/Canopy/edm/envs/User/lib/python2.7', '/Users/joshgladden/Library/Enthought/Canopy/edm/envs/User/lib/python2.7/plat-darwin', '/Users/joshgladden/Library/Enthought/Canopy/edm/envs/User/lib/python2.7/plat-mac', '/Users/joshgladden/Library/Enthought/Canopy/edm/envs/User/lib/python2.7/plat-mac/lib-scriptpackages', '/Users/joshgladden/Library/Enthought/Canopy/edm/envs/User/lib/python2.7/lib-tk', '/Users/joshgladden/Library/Enthought/Canopy/edm/envs/User/lib/python2.7/lib-old', '/Users/joshgladden/Library/Enthought/Canopy/edm/envs/User/lib/python2.7/lib-dynload', '/Users/joshgladden/Library/Enthought/Canopy/edm/envs/User/lib/python2.7/site-packages', '/Users/joshgladden/Library/Enthought/Canopy/edm/envs/User/lib/python2.7/site-packages/IPython/extensio
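Two other pieces of **sys** come up often: sys.version_info, which is easier to test in code than the sys.version string, and sys.argv, the list of command-line arguments. A minimal sketch, assuming nothing beyond the standard library:

```python
import sys

# sys.version_info holds the version as numbers, so it is easy to compare
major, minor = sys.version_info[0], sys.version_info[1]
print("Running Python %d.%d on platform '%s'" % (major, minor, sys.platform))

# sys.argv is a list of strings; element 0 is the script name,
# the remaining elements are the arguments typed after it
n_args = len(sys.argv) - 1
print("Number of command-line arguments: %d" % n_args)
```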

The **time** module provides access to time information.

In [10]:
import time

In [13]:
print(time.time())
print(time.gmtime())
startTime = time.time()
time.sleep(3)
endTime = time.time()
print('The elapsed time is %2.3f seconds'%(endTime-startTime))

1518383221.33
time.struct_time(tm_year=2018, tm_mon=2, tm_mday=11, tm_hour=21, tm_min=7, tm_sec=1, tm_wday=6, tm_yday=42, tm_isdst=0)
The elapsed time is 3.001 seconds
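The fields of a struct_time can also be turned into text in one step with time.strftime. The sketch below formats the Unix epoch (time zero in UTC) so the result is the same on every machine; the format strings are just examples.

```python
import time

# time.gmtime(0) is the Unix epoch: January 1, 1970, 00:00 UTC
epoch = time.gmtime(0)

# %m = month, %d = day of month, %Y = 4 digit year, %H = hour, %M = minute
stamp = time.strftime('%m_%d_%Y_%Hhr_%Mmin', epoch)
print(stamp)   # 01_01_1970_00hr_00min

# The same format applied to the current local time
now_stamp = time.strftime('%m_%d_%Y', time.localtime())
print(now_stamp)
```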


The **glob** module is useful for working with directory contents, such as listing all files of a certain type.

In [14]:
import glob

In [19]:
mydatFiles = glob.glob('*.dat')
mydatFiles

['newdata.dat',
 'mynewfile.dat',
 'mydata.dat',
 'testdata2.dat',
 'data.dat',
 'oldfile.dat',
 'speedup_PI.dat',
 'zunzunData.dat',
 'newfile.dat']
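Note that glob returns matches in no particular order, so it can help to sort them. The sketch below is an illustration with made-up file names: it builds a throwaway directory, fills it with empty files, and shows two wildcard patterns.

```python
import glob
import os
import shutil
import tempfile

# Make a temporary directory with a few empty example files
tmpdir = tempfile.mkdtemp()
for name in ['a.dat', 'b.dat', 'notes.txt']:
    open(os.path.join(tmpdir, name), 'w').close()

# '*' matches any number of characters
dat_files = sorted(glob.glob(os.path.join(tmpdir, '*.dat')))
names = [os.path.basename(p) for p in dat_files]
print(names)   # ['a.dat', 'b.dat']

# '?' matches exactly one character, so 'notes.txt' is excluded here
short = sorted(os.path.basename(p)
               for p in glob.glob(os.path.join(tmpdir, '?.*')))
print(short)   # ['a.dat', 'b.dat']

# Clean up the temporary directory
shutil.rmtree(tmpdir)
```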

**Exercise 1:** Time stamper should do the following:
- Take in a file extension as an argument, and a path as an optional 2nd argument (default to the current directory)
- Build a list of the files with that extension in that directory
- Add a time stamp to their names
- Make a new directory with the current date as its name
- Copy all the files into that directory with the time stamped names.

**Note:** Because it is not possible to pass command line arguments to a program in a Jupyter notebook, I am just listing the code in the cell below. It will produce an error if you run it as a notebook cell. Save it to a Python file (say, 'lec07_timestamper.py'), then run it with the %run magic and include the argument (see the %run cell that follows the code).


In [21]:
import os,sys,glob,time,subprocess

origpath=os.getcwd()

args=sys.argv
if len(args) < 2:
    # args[0] is the script name, so no extension was given
    print("A 3 character file extension is required (such as .dat).\n Quitting..")
    sys.exit()
elif len(args) == 2:
    filetype = args[1]
elif len(args) == 3:
    filetype = args[1]
    path = args[2]
    os.chdir(path)
else:
    print("Too many arguments! Usage: timestamper.py .dat [ ./somedata/ ]")
    sys.exit()
		
#Add preceding dot for lazy users
if not filetype.startswith('.'):
    filetype = '.' + filetype

#Get local time down to day and make new directory
ct = time.localtime()
mon = ct.tm_mon
day = ct.tm_mday #tm_mday is day of the MONTH (tm_yday and tm_wday give the day of the year and of the week)
year = ct.tm_year
dirname = 'FilesFor_%02i_%02i_%i'%(mon,day,year)

#Note: if the directory already exists, mkdir() will raise an error.
#This checks whether the directory exists and makes it if not
if not os.path.isdir(dirname):
    print("Making a new directory: " + dirname)
    os.mkdir(dirname)
else:
    print("Directory: " + dirname + " already exists.")

#Make a list of files and process them in a loop
files = glob.glob('*'+filetype)

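for fname in files:  # 'fname' avoids shadowing the Python 2 builtin 'file'
    ct = time.localtime()
    hour = ct.tm_hour
    minute = ct.tm_min  # 'minute' avoids shadowing the builtin min()
    insertion = '_%02i_%02i_%i_%02ihr_%02imin' % (mon, day, year, hour, minute)
    # Strip the extension by its actual length (works for .py, .jpeg, etc.)
    rootname = fname[:-len(filetype)]
    newname = rootname + insertion + filetype
    # 'cp' is a Unix command, so this call will not work on Windows
    subprocess.call(['cp', fname, os.path.join(dirname, newname)])
    print("File %s has been renamed %s." % (fname, newname))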
	
# Change back to original directory
os.chdir(origpath)


OSError: [Errno 20] Not a directory: '/Users/joshgladden/Library/Jupyter/runtime/kernel-334b87b9-483a-4cd9-9b71-3856e61ab290.json'

In [22]:
%run lec07_timestamper.py .png

Making a new directory: FilesFor_02_11_2018
File myplot.png has been renamed myplot_02_11_2018_15hr_19min.png.
File banner_small.png has been renamed banner_small_02_11_2018_15hr_19min.png.
File test.png has been renamed test_02_11_2018_15hr_19min.png.
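Since a notebook cell cannot receive command-line arguments, one alternative is to wrap the same steps in a function that takes the extension and path as ordinary arguments. The version below is only a sketch: the name timestamp_copy is hypothetical and not part of the lecture code, and it swaps the external cp command for shutil.copy from the standard library so that it also runs on Windows.

```python
import glob
import os
import shutil
import time


def timestamp_copy(filetype, path='.'):
    """Copy files of one type into a dated folder with time stamped names.

    Hypothetical helper for illustration. Returns the list of new paths.
    """
    # Add the leading dot if the caller left it off
    if not filetype.startswith('.'):
        filetype = '.' + filetype

    # Build the dated directory name, e.g. FilesFor_02_11_2018
    now = time.localtime()
    dirname = os.path.join(path, time.strftime('FilesFor_%m_%d_%Y', now))
    if not os.path.isdir(dirname):
        os.mkdir(dirname)

    # Text inserted between the root name and the extension
    insertion = time.strftime('_%m_%d_%Y_%Hhr_%Mmin', now)

    copied = []
    for fname in sorted(glob.glob(os.path.join(path, '*' + filetype))):
        rootname, ext = os.path.splitext(os.path.basename(fname))
        target = os.path.join(dirname, rootname + insertion + ext)
        shutil.copy(fname, target)   # the original file is left in place
        copied.append(target)
    return copied
```

In a notebook this could be called as timestamp_copy('.png') for the current directory, or timestamp_copy('dat', './somedata/') for another one. Because it never calls os.chdir, there is no need to change back to the original directory afterwards.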


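**Exercise 2: Logistic Growth**
This is a population growth model that links the current population to the population at the previous time step. The time step is indicated by the subscript $i$, $\rho$ is the growth rate in percent per time step, and $M$ is the maximum population the model can sustain.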

$ x_i = x_{i-1} + \frac{\rho}{100} x_{i-1} \left( 1 - \frac{x_{i-1}}{M} \right)$
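To make the update rule concrete, here is a single step worked out numerically. This is a small sketch: the function name logistic_step is hypothetical, and the values (starting population 100, $\rho = 1$, $M = 1500$) are chosen for illustration.

```python
def logistic_step(x_prev, rho, M):
    """Return the population one time step after x_prev."""
    return x_prev + (rho / 100.0) * x_prev * (1.0 - x_prev / float(M))

# One step from x = 100: 100 + 0.01*100*(1 - 100/1500) = 100 + 0.9333...
x1 = logistic_step(100.0, 1.0, 1500.0)
print(round(x1, 4))   # 100.9333

# When the population equals M the growth term vanishes, so x stays at M
x_at_max = logistic_step(1500.0, 1.0, 1500.0)
print(x_at_max)       # 1500.0
```

The second result shows why the population levels off: as $x$ approaches $M$, the factor $(1 - x/M)$ goes to zero and growth stops.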

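I present two methods for coding this. The first uses standard Python lists to build up the data and converts them to arrays for plotting. The second preallocates a NumPy array of zeros and fills it in one element at a time. You might expect the arrays to be faster, but in the timing output below the list version is slightly quicker (about 1.1 ms versus 1.5 ms). Assigning to a NumPy array one element at a time inside a Python loop carries extra overhead; arrays pay off when whole-array operations replace the loop.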

In [1]:
%matplotlib wx
import matplotlib.pyplot as plt
import numpy as np
import time
 
def loggrowth(x,rho,M):
    """Advance the population x by one time step of the logistic model."""
    newx=x + rho/(100.)*x*(1-x/float(M))
    return newx

	 
startT=time.time()
rho=1.0
M=1500.0
tsteps=1000
# Make sequences lists
tlist=range(0,tsteps,1)
pop=[100]
for t in tlist[1:]:
    newpop=loggrowth(pop[-1],rho,M)
    pop.append(newpop)

pop=np.array(pop)
times=np.array(tlist)
meth1T = time.time() - startT 
plt.plot(times,pop,'b-',label='Using Lists')

#Another method: preallocate a NumPy array and fill it in place
startT=time.time()
indexes=range(tsteps)  # same number of points as method 1
pop=np.zeros(len(indexes))
pop[0]=100
for i in indexes[1:]:
    pop[i] = loggrowth(pop[i-1],rho,M)

meth2T = time.time() - startT 
plt.plot(indexes,pop,'go',label='Using Arrays')
plt.legend(loc=2)

print('Time for method 1: %3.5f ms'% (meth1T*1000.))
print('Time for method 2: %3.5f ms' % (meth2T*1000.))

Time for method 1: 1.10507 ms
Time for method 2: 1.50108 ms
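Elapsed times of about a millisecond taken from a single pair of time.time() calls are easily affected by whatever else the computer is doing. A more robust comparison, sketched below, uses the standard timeit module to repeat each method many times and keep the best result. The wrapper functions grow_with_list and grow_with_array are hypothetical names introduced for this illustration, and the repeat counts are arbitrary.

```python
import timeit
import numpy as np


def loggrowth(x, rho, M):
    return x + rho / 100. * x * (1 - x / float(M))


def grow_with_list(tsteps, x0=100.0, rho=1.0, M=1500.0):
    # Method 1: append to a list, convert to an array at the end
    pop = [x0]
    for _ in range(1, tsteps):
        pop.append(loggrowth(pop[-1], rho, M))
    return np.array(pop)


def grow_with_array(tsteps, x0=100.0, rho=1.0, M=1500.0):
    # Method 2: preallocate an array and fill it element by element
    pop = np.zeros(tsteps)
    pop[0] = x0
    for i in range(1, tsteps):
        pop[i] = loggrowth(pop[i - 1], rho, M)
    return pop


number = 20   # calls per measurement
repeat = 5    # number of measurements; the minimum is the least noisy
t_list = min(timeit.repeat(lambda: grow_with_list(1000),
                           repeat=repeat, number=number)) / number
t_array = min(timeit.repeat(lambda: grow_with_array(1000),
                            repeat=repeat, number=number)) / number

print('Best time per run, lists : %3.5f ms' % (t_list * 1000.))
print('Best time per run, arrays: %3.5f ms' % (t_array * 1000.))
```

The actual numbers depend on the machine and the Python and NumPy versions, so no expected output is shown here.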


In [31]:
pop.shape

(1000,)