Dr. Mark Humphrys

School of Computing. Dublin City University.




Lab - stock prices


getprice (stock symbol)
Get the price of that stock.
Usage like: getprice GOOG
  1. Download quote page. Parse to extract price.
  2. See Parsing XML / HTML

  3. Hard to parse: http://bigcharts.marketwatch.com/quickchart/quickchart.asp?symb=SYMBOL
    • grep "Last:" | head -1 | various sed's

  4. Easier to parse: http://finance.yahoo.com/q?s=SYMBOL because stock price is delimited by tags.
    • grep "Last Trade:"
    • grep "yfs_l10_SYMBOL"
    • Something like:
      <span id="yfs_l10_goog">540.30</span>

    • If you clean up the HTML first to make it well-formed XHTML, you can use xpath to parse it properly:
       
      # search for <span tag(s) with attribute id="yfs_l10_goog"  
      cat cleanedupfile.xhtml | xpath '//span[@id="yfs_l10_goog"]'     
      
      # get first one    
      cat cleanedupfile.xhtml | xpath '(//span[@id="yfs_l10_goog"])[1]'      
      
      # get contents    
      cat cleanedupfile.xhtml | xpath '(//span[@id="yfs_l10_goog"])[1]/text()'    > outputfile 
      

  5. In general, a remote HTML page may not be written in a way that lets a machine easily find the stock price.
  6. Downside of the script: it has to be rewritten if the HTML format changes.
  7. Web scraping issues.
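
The grep approach in step 4 could be sketched roughly as below. This is a minimal sketch, not a definitive solution. It assumes the quote page still marks the price as <span id="yfs_l10_SYMBOL">, with the symbol in lower case as in the sample line above, and that wget is installed. The function names extract_price and getprice are just illustrative. If the page format has changed, the patterns must be rewritten (see step 6).

```shell
# Read HTML on stdin and print the contents of the first
# <span id="yfs_l10_SYMBOL"> tag. Assumes the span sits on one line.
extract_price() {
    # Assumption: the id uses the lower-case symbol, e.g. yfs_l10_goog
    sym=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    grep "yfs_l10_$sym" |
        head -1 |
        sed -e "s/.*<span id=\"yfs_l10_$sym\">//" -e 's/<.*//'
}

# Download the quote page to stdout and pass it to extract_price.
# Assumption: the URL from step 4 still serves a page in this format.
getprice() {
    wget -q -O - "http://finance.yahoo.com/q?s=$1" | extract_price "$1"
}
```

To try the parsing step without touching the network, feed it the sample line: echo '<span id="yfs_l10_goog">540.30</span>' | extract_price GOOG. Keeping the download and the parsing in separate functions means only extract_price needs changing when the HTML format changes.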

