RTB
|
bf4a5bb41f
|
added scraping of synonyms labeled as 'synonym_cn'
|
2014-04-18 13:36:33 +02:00 |
|
RTB
|
a4a21f2578
|
changed default reliability from empty string to Unknown as per UML design
|
2014-04-18 13:19:05 +02:00 |
|
RTB
|
9389af99ba
|
removed manual Requests for wikipedia URLs as wikipedia parser handles those through synonyms
|
2014-04-18 13:17:24 +02:00 |
|
RTB
|
ae21fa7c67
|
chemspider now scrapes for reference data on synonyms
|
2014-04-18 13:16:22 +02:00 |
|
RTB
|
119d48890d
|
fixed conditional for emitting synonyms, it compiles again
|
2014-04-18 12:14:54 +02:00 |
|
RTB
|
04751b6670
|
chemspider parser now only emits synonyms labeled as 'English'
|
2014-04-17 22:47:43 +02:00 |
|
RTB
|
ce5eeb56a6
|
added scraping of synonym language
|
2014-04-17 22:37:37 +02:00 |
|
RTB
|
4f2c046c9c
|
rewrote parse_synonyms and new_synonym to use an internal dictionary structure
|
2014-04-17 22:06:45 +02:00 |
|
RTB
|
2e95d35283
|
modified parse_synonyms and new_synonym to include a Selector for future edits
|
2014-04-17 21:30:53 +02:00 |
|
Bas Vb
|
be63315ca2
|
regex
|
2014-04-16 17:01:35 +02:00 |
|
Jip J. Dekker
|
efacc08a3d
|
Merge branch 'develop' into feature/Wikipedia
Conflicts: Fourmi.py
|
2014-04-16 16:49:03 +02:00 |
|
Bas Vb
|
6f82b117c9
|
new function to clean up the datapoints
|
2014-04-16 16:23:33 +02:00 |
|
Rob tB
|
9a78e186bc
|
chemspider parser now grabs data from ExtendedCompoundInfo() of chemspider API (no units)
|
2014-04-16 16:22:47 +02:00 |
|
Bas Vb
|
74aa446f40
|
minor edits (comments etc.)
|
2014-04-16 15:27:36 +02:00 |
|
Rob tB
|
caf7d3df4e
|
fixed ExtendedCompoundInfo url to have csid parameter instead of query
|
2014-04-16 15:27:10 +02:00 |
|
Bas Vb
|
34c3a8b4d6
|
remove empty data points
|
2014-04-16 15:22:47 +02:00 |
|
Rob tB
|
2d314aee6a
|
created stub to parse ExtendedCompoundInfo from ChemSpider MassSpec API
|
2014-04-16 15:21:33 +02:00 |
|
Rob tB
|
7fc980befe
|
chemspider should now only generate new Requests for wikipedia links from 'expert confirmed' synonyms
|
2014-04-16 15:02:37 +02:00 |
|
Bas Vb
|
ce3105f3c1
|
switched to a general loop over all values, which retrieves all elements from the Wikipedia infobox (except those with a colspan, because these cause problems)
|
2014-04-16 14:56:32 +02:00 |
|
Rob tB
|
87282fc572
|
new properties in parse_properties now use dictionary syntax
|
2014-04-16 14:26:27 +02:00 |
|
Rob tB
|
93a6f098a9
|
log messages are now DEBUG instead of WARNING
|
2014-04-16 13:28:59 +02:00 |
|
Bas Vb
|
f1280dd66d
|
get a value, not a list, from xpath
|
2014-04-16 13:23:50 +02:00 |
|
Rob tB
|
c1b5f810cb
|
unused Result properties are now empty string instead of None
|
2014-04-16 11:53:59 +02:00 |
|
Bas Vb
|
d99548e3b6
|
Added density, molar entropy and heat capacity
|
2014-04-16 11:14:02 +02:00 |
|
Bas Vb
|
d778050f36
|
Able to parse the weblinks to other databases, one example done
|
2014-04-16 10:37:57 +02:00 |
|
Jip J. Dekker
|
9dcb150356
|
Merge branch 'develop' into feature/chemspider-parser
|
2014-04-16 10:24:52 +02:00 |
|
Bas Vb
|
cd1637b0fe
|
Both boiling point and melting point are now parsed from chemical Wikipedia pages. There is one error about different types of attributes in the Result items; this needs to be fixed by cleaning up the retrieved data.
|
2014-04-16 00:50:50 +02:00 |
|
Bas Vb
|
1ca3593ae1
|
Parse is runnable now.
|
2014-04-16 00:35:19 +02:00 |
|
Jip J. Dekker
|
61ca2520e3
|
Added feed export functionality
|
2014-04-15 19:40:54 +02:00 |
|
RTB
|
8e46762a9e
|
fix: if no experimental data, return predicted ACD/Labs data instead of None
|
2014-04-15 18:56:38 +02:00 |
|
Jip J. Dekker
|
ffb3861034
|
Search for single compound, filename should be lowercase
|
2014-04-15 18:49:30 +02:00 |
|
RTB
|
ff0eb309da
|
ChemSpider parser now handles the Predicted - ACD/Labs tab for scraping properties
|
2014-04-14 17:27:02 +02:00 |
|
RTB
|
2ae3ac9c51
|
added parse_properties to scrape the Experimental Physico-chemical Properties table if it exists
|
2014-04-14 13:09:14 +02:00 |
|
RTB
|
31a63829f8
|
chemspider parser now makes new synonym requests with the scraped synonyms
|
2014-04-14 01:23:15 +02:00 |
|
RTB
|
e95df8eaa3
|
ignore_list now contains the intended names instead of Result objects
|
2014-04-14 01:20:24 +02:00 |
|
RTB
|
564dbc3292
|
added ignore list to new_compound_request for synonyms found by chemspider parser
|
2014-04-14 00:33:25 +02:00 |
|
RTB
|
b1b969a16c
|
corrected usage of __spider variable
|
2014-04-14 00:28:47 +02:00 |
|
RTB
|
0ad98905e3
|
added scraping for wikipedia links in synonym tab
|
2014-04-13 23:35:25 +02:00 |
|
RTB
|
5565c28a1e
|
moved parsing of synonyms to 'parse_synonyms' function
|
2014-04-13 23:14:23 +02:00 |
|
RTB
|
859a18c61a
|
added parsing of synonyms
|
2014-04-12 22:27:28 +02:00 |
|
RTB
|
22fa67735d
|
added parse_searchrequest function
|
2014-04-12 19:41:36 +02:00 |
|
RTB
|
246463b450
|
simplified debug output, WARNING label should be temporary
|
2014-04-12 19:19:56 +02:00 |
|
RTB
|
423cb90a6a
|
Merge branch 'develop' into feature/chemspider-parser
|
2014-04-12 19:13:02 +02:00 |
|
RTB
|
0e3ef9a792
|
hardcoded ChemSpider API token into ChemSpider.py
|
2014-04-08 16:14:47 +02:00 |
|
Bas Vb
|
f9799c30d8
|
Parse is runnable now.
|
2014-04-08 14:59:09 +02:00 |
|
RTB
|
a4dc8c8711
|
corrected Chemspider parser to be a subclass of Parser
|
2014-04-08 13:10:02 +02:00 |
|
RTB
|
0da286c907
|
created basic structure of ChemSpider search parser
|
2014-04-08 12:08:45 +02:00 |
|
Jip J. Dekker
|
e10ac12d04
|
Merge branch 'develop' into feature/Wikipedia
|
2014-04-08 11:45:23 +02:00 |
|
Jip J. Dekker
|
da17a149c0
|
Spider is now able to handle None requests from parsers while handling new compounds
|
2014-04-08 11:42:43 +02:00 |
|
Jip J. Dekker
|
4b0c4acf96
|
Updated the wikipedia parser as a rightful subclass of Parser
|
2014-04-08 11:40:30 +02:00 |
|