Archived

199 Commits

Author SHA1 Message Date
Bas Vb
cd1637b0fe Both Boiling point and melting point are now parsed from chemical Wikipedia pages, there's one error about different types of attributes in the Result-items, this needs to be fixed by cleaning up the retrieved data. 2014-04-16 00:50:50 +02:00
Bas Vb
1ca3593ae1 Parse is runnable now. 2014-04-16 00:35:19 +02:00
Jip J. Dekker
6799a1a956 Merge branch 'release/v0.1.0' into develop 1-searchable 2014-04-15 19:49:07 +02:00
Jip J. Dekker
2d5e39de81 Merge branch 'release/v0.1.0' v0.1.0 2014-04-15 19:48:55 +02:00
Jip J. Dekker
972e5da0d2 Removed debug code and typos. 2014-04-15 19:48:27 +02:00
Jip J. Dekker
d770f79a7a Bumped version number 2014-04-15 19:46:10 +02:00
Jip J. Dekker
878d8e5efb Merge branch 'feature/CLI' into develop 2014-04-15 19:44:41 +02:00
Jip J. Dekker
61ca2520e3 Added feed export functionality 2014-04-15 19:40:54 +02:00
Jip J. Dekker
e65d3a6898 Added the options for the Feed exports 2014-04-15 18:57:51 +02:00
RTB
8e46762a9e fix: if no experimental data, return predicted acd/labs data instead of None 2014-04-15 18:56:38 +02:00
Jip J. Dekker
ffb3861034 Search for single compound, filename should be lowercase 2014-04-15 18:49:30 +02:00
Jip J. Dekker
91ed053ac5 Stopped log from interfering with STDOUT 2014-04-15 18:17:35 +02:00
Jip J. Dekker
a4dd6e1835 Made logging work 2014-04-14 21:31:20 +02:00
Jip J. Dekker
2ad33080c6 First setup of the CLI, decided on a structure 2014-04-14 20:45:07 +02:00
Jip J. Dekker
ee01e697d3 Added Docopt as a CLI framework 2014-04-14 20:21:41 +02:00
RTB
ff0eb309da ChemSpider parser now handles the Predicted - ACD/Labs tab for scraping properties 2014-04-14 17:27:02 +02:00
RTB
2ae3ac9c51 added parse_properties to scrape the Experimental Physico-chemical Properties table if it exists 2014-04-14 13:09:14 +02:00
RTB
31a63829f8 chemspider parser now makes new synonym requests with the scraped synonyms 2014-04-14 01:23:15 +02:00
RTB
e95df8eaa3 ignore_list now contains the intended names instead of Result objects 2014-04-14 01:20:24 +02:00
RTB
564dbc3292 added ignore list to new_compound_request for synonyms found by chemspider parser 2014-04-14 00:33:25 +02:00
RTB
b1b969a16c corrected usage of __spider variable 2014-04-14 00:28:47 +02:00
RTB
0ad98905e3 added scraping for wikipedia links in synonym tab 2014-04-13 23:35:25 +02:00
RTB
5565c28a1e moved parsing of synonyms to 'parse_synonyms' function 2014-04-13 23:14:23 +02:00
RTB
859a18c61a added parsing of synonyms 2014-04-12 22:27:28 +02:00
RTB
22fa67735d added parse_searchrequest function 2014-04-12 19:41:36 +02:00
RTB
246463b450 simplified debug output, WARNING label should be temporary 2014-04-12 19:19:56 +02:00
RTB
423cb90a6a Merge branch 'develop' into feature/chemspider-parser 2014-04-12 19:13:02 +02:00
RTB
0e3ef9a792 hardcoded ChemSpider API token into ChemSpider.py 2014-04-08 16:14:47 +02:00
Bas Vb
f9799c30d8 Parse is runnable now. 2014-04-08 14:59:09 +02:00
RTB
a4dc8c8711 corrected Chemspider parser to be a subclass of Parser 2014-04-08 13:10:02 +02:00
RTB
0da286c907 created basic structure of ChemSpider search parser 2014-04-08 12:08:45 +02:00
Jip J. Dekker
e10ac12d04 Merge branch 'develop' into feature/Wikipedia 2014-04-08 11:45:23 +02:00
Jip J. Dekker
debbc5e62a Merge branch 'hotfix/none-requests' into develop 2014-04-08 11:44:42 +02:00
Jip J. Dekker
199fa5419e Merge branch 'hotfix/none-requests' 2014-04-08 11:44:26 +02:00
Jip J. Dekker
622dd4ad00 Small fix to ensure unique classes and load all parsers 2014-04-08 11:43:32 +02:00
Jip J. Dekker
da17a149c0 Spider is now able to handle none-request from parsers while handling new compounds 2014-04-08 11:42:43 +02:00
Jip J. Dekker
4b0c4acf96 Updated the wikipedia parser as a rightful subclass of Parser 2014-04-08 11:40:30 +02:00
Bas Vb
f3807c3018 Fixed the errors, but still not able to run/test the parse() function 2014-04-06 20:28:03 +02:00
Bas Vb
add4a13a4d Trying to make a start with the WikipediaParser, but I can't find out with the Scrapy website (or another way) what the structure of the file should be, and how I can test/run the crawling on a page. 2014-04-06 18:02:09 +02:00
Nout van Deijck
81a93c44bb added author 2014-04-03 12:19:17 +02:00
Bas Vb
60c409da3d New file and branch for the Wikipedia parser 2014-04-03 12:05:06 +02:00
Bas Vb
b4ff4a3c3b New file and branch for the Wikipedia parser 2014-04-03 12:00:27 +02:00
Jip J. Dekker
3a074467e6 Merge branch 'hotfix/No_TABs' into develop 2014-04-02 14:22:13 +02:00
Jip J. Dekker
9805bb5adb Merge branch 'hotfix/No_TABs' 2014-04-02 14:21:34 +02:00
Jip J. Dekker
f6981057df Changed everything to spaces 2014-04-02 14:20:05 +02:00
Jip J. Dekker
595f0253e2 Merge branch 'release/v0.0.1' into develop 2014-04-01 21:44:31 +02:00
Jip J. Dekker
254e8db3aa Merge branch 'release/v0.0.1' v0.0.1 2014-04-01 21:44:08 +02:00
Jip J. Dekker
c9e09f8ab9 Added a version message 2014-04-01 21:42:54 +02:00
Jip J. Dekker
2e8017c590 Merge branch 'feature/parsing-scheme' into develop 2014-04-01 21:40:26 +02:00
Jip J. Dekker
7bc160f676 The spider is now able to start using the synonym generator 2014-04-01 21:38:11 +02:00