Archived

510 Commits

Author SHA1 Message Date
Bas Vb
f1280dd66d get value not list from xpath 2014-04-16 13:23:50 +02:00
Rob tB
c1b5f810cb unused Result properties are now empty string instead of None 2014-04-16 11:53:59 +02:00
Jip J. Dekker
92a74de9e0 Added the include and exclude options. 2014-04-16 11:17:48 +02:00
Bas Vb
d99548e3b6 Added density, molar entropy and heat capacity 2014-04-16 11:14:02 +02:00
Jip J. Dekker
e0e64bd65a Implemented source exclusion 2014-04-16 11:03:59 +02:00
Jip J. Dekker
d823c105e6 Implemented source inclusion 2014-04-16 10:48:29 +02:00
Bas Vb
d778050f36 Able to parse the weblinks to other databases, one example done 2014-04-16 10:37:57 +02:00
Jip J. Dekker
7b57d86178 Removed redundant source loader 2014-04-16 10:36:46 +02:00
Jip J. Dekker
9dcb150356 Merge branch 'develop' into feature/chemspider-parser 2014-04-16 10:24:52 +02:00
Jip J. Dekker
a06bf643f1 Made sourceloader a class and implemented the listing of all sources 2014-04-16 10:14:29 +02:00
Jip J. Dekker
8b7cfac2de Added a new command to the CLI, implementation will follow. 2014-04-16 09:33:07 +02:00
Bas Vb
cd1637b0fe Both Boiling point and melting point are now parsed from chemical Wikipedia pages, there's one error about different types of attributes in the Result-items, this needs to be fixed by cleaning up the retrieved data. 2014-04-16 00:50:50 +02:00
Bas Vb
1ca3593ae1 Parse is runnable now. 2014-04-16 00:35:19 +02:00
Jip J. Dekker
6799a1a956 Merge branch 'release/v0.1.0' into develop 1-searchable 2014-04-15 19:49:07 +02:00
Jip J. Dekker
2d5e39de81 Merge branch 'release/v0.1.0' v0.1.0 2014-04-15 19:48:55 +02:00
Jip J. Dekker
972e5da0d2 Removed debug code and typos. 2014-04-15 19:48:27 +02:00
Jip J. Dekker
d770f79a7a Bumped version number 2014-04-15 19:46:10 +02:00
Jip J. Dekker
878d8e5efb Merge branch 'feature/CLI' into develop 2014-04-15 19:44:41 +02:00
Jip J. Dekker
61ca2520e3 Added feed export functionality 2014-04-15 19:40:54 +02:00
Jip J. Dekker
e65d3a6898 Added the options for the Feed exports 2014-04-15 18:57:51 +02:00
RTB
8e46762a9e fix: if no experimental data, return predicted acd/labs data instead of None 2014-04-15 18:56:38 +02:00
Jip J. Dekker
ffb3861034 Search for single compound, filename should be lowercase 2014-04-15 18:49:30 +02:00
Jip J. Dekker
91ed053ac5 Stopped log from interfering with STDOUT 2014-04-15 18:17:35 +02:00
Jip J. Dekker
a4dd6e1835 Made logging work 2014-04-14 21:31:20 +02:00
Jip J. Dekker
2ad33080c6 First setup of the CLI, decided on a structure 2014-04-14 20:45:07 +02:00
Jip J. Dekker
ee01e697d3 Added Docopt as a CLI framework 2014-04-14 20:21:41 +02:00
RTB
ff0eb309da ChemSpider parser now handles the Predicted - ACD/Labs tab for scraping properties 2014-04-14 17:27:02 +02:00
RTB
2ae3ac9c51 added parse_properties to scrape the Experimental Physico-chemical Properties table if it exists 2014-04-14 13:09:14 +02:00
RTB
31a63829f8 chemspider parser now makes new synonym requests with the scraped synonyms 2014-04-14 01:23:15 +02:00
RTB
e95df8eaa3 ignore_list now contains the intended names instead of Result objects 2014-04-14 01:20:24 +02:00
RTB
564dbc3292 added ignore list to new_compound_request for synonyms found by chemspider parser 2014-04-14 00:33:25 +02:00
RTB
b1b969a16c corrected usage of __spider variable 2014-04-14 00:28:47 +02:00
RTB
0ad98905e3 added scraping for wikipedia links in synonym tab 2014-04-13 23:35:25 +02:00
RTB
5565c28a1e moved parsing of synonyms to 'parse_synonyms' function 2014-04-13 23:14:23 +02:00
RTB
859a18c61a added parsing of synonyms 2014-04-12 22:27:28 +02:00
RTB
22fa67735d added parse_searchrequest function 2014-04-12 19:41:36 +02:00
RTB
246463b450 simplified debug output, WARNING label should be temporary 2014-04-12 19:19:56 +02:00
RTB
423cb90a6a Merge branch 'develop' into feature/chemspider-parser 2014-04-12 19:13:02 +02:00
RTB
0e3ef9a792 hardcoded ChemSpider API token into ChemSpider.py 2014-04-08 16:14:47 +02:00
Bas Vb
f9799c30d8 Parse is runnable now. 2014-04-08 14:59:09 +02:00
RTB
a4dc8c8711 corrected Chemspider parser to be a subclass of Parser 2014-04-08 13:10:02 +02:00
RTB
0da286c907 created basic structure of ChemSpider search parser 2014-04-08 12:08:45 +02:00
Jip J. Dekker
e10ac12d04 Merge branch 'develop' into feature/Wikipedia 2014-04-08 11:45:23 +02:00
Jip J. Dekker
debbc5e62a Merge branch 'hotfix/none-requests' into develop 2014-04-08 11:44:42 +02:00
Jip J. Dekker
199fa5419e Merge branch 'hotfix/none-requests' 2014-04-08 11:44:26 +02:00
Jip J. Dekker
622dd4ad00 Small fix to ensure unique classes and load all parsers 2014-04-08 11:43:32 +02:00
Jip J. Dekker
da17a149c0 Spider is now able to handle none-request from parsers while handling new compounds 2014-04-08 11:42:43 +02:00
Jip J. Dekker
4b0c4acf96 Updated the wikipedia parser as a rightful subclass of Parser 2014-04-08 11:40:30 +02:00
Bas Vb
f3807c3018 Fixed the errors, but still not able to run/test the parse() function 2014-04-06 20:28:03 +02:00
Bas Vb
add4a13a4d Trying to make a start with the WikipediaParser, but I can't find out with the Scrapy website (or another way) what the structure of the file should be, and how I can test/run the crawling on a page. 2014-04-06 18:02:09 +02:00