Nout van Deijck
|
291547a5ad
|
now returns good results, with property values and corresponding sources
|
2014-06-04 15:44:53 +02:00 |
|
Nout van Deijck
|
ba8f845178
|
now also (finally) scrapes property values and names, but not yet coupled together and not yet returned.
|
2014-06-02 09:26:36 +02:00 |
|
Nout van Deijck
|
8083d0c7bc
|
PubChem scrapes synonyms, gets custom url to get data on properties from
|
2014-05-21 16:11:48 +02:00 |
|
Nout van Deijck
|
fb41d772f2
|
Added custom user-agent because the site would otherwise block the scraper
|
2014-05-21 16:11:02 +02:00 |
|
Nout van Deijck
|
4b377bb9a9
|
PubChem now scrapes its synonyms
|
2014-05-21 15:25:55 +02:00 |
|
Nout van Deijck
|
84f2e3dbea
|
Testing search function PubChem
|
2014-05-21 14:53:51 +02:00 |
|
Nout van Deijck
|
f728dff6b0
|
Developing PubChem parser, first draft, not tested nor finished completely
|
2014-05-14 12:01:05 +02:00 |
|
Jip J. Dekker
|
d523d4edcd
|
Spelling errors
|
2014-04-23 22:58:04 +02:00 |
|
Jip J. Dekker
|
c5bffffeda
|
Delayed refactor from developing branch
|
2014-04-23 22:55:28 +02:00 |
|
Jip J. Dekker
|
964e0b8ade
|
Merge branch 'develop' into feature/Wikipedia
|
2014-04-23 22:53:28 +02:00 |
|
Nout van Deijck
|
9cbdf57238
|
fixed comments
|
2014-04-23 16:24:27 +02:00 |
|
Nout van Deijck
|
150fc5bea7
|
added comments
|
2014-04-23 16:17:23 +02:00 |
|
Nout van Deijck
|
9cefd336e0
|
Cleaning up code and added log messages
|
2014-04-23 16:02:37 +02:00 |
|
Jip J. Dekker
|
90f03734a6
|
Refactored class name
|
2014-04-23 15:57:10 +02:00 |
|
Jip J. Dekker
|
e18e4b4b26
|
Resolved all references to the old folder
|
2014-04-23 15:55:38 +02:00 |
|
Jip J. Dekker
|
1e24453a11
|
Renamed filename of basic source class
|
2014-04-23 15:51:03 +02:00 |
|
Nout van Deijck
|
507006889b
|
Fixed problem with strange urls, now adds all external identifiers as requests
|
2014-04-23 15:49:23 +02:00 |
|
Jip J. Dekker
|
662ee8f490
|
Renamed folder
|
2014-04-23 15:49:03 +02:00 |
|
Bas Vb
|
62475d965d
|
Cleaning up code
|
2014-04-23 15:24:57 +02:00 |
|
Nout van Deijck
|
3e1b33164e
|
Added some comments and tried a different for loop for adding requests
|
2014-04-23 13:48:44 +02:00 |
|
Nout van Deijck
|
1ced65e2b6
|
Parser now adds extra requests for every identifier to an external source that is in the Wikipedia chembox
|
2014-04-23 13:18:50 +02:00 |
|
Nout van Deijck
|
b5c83125f7
|
Added extra request for ChemSpider link retrieved from Wikipedia
|
2014-04-23 12:27:53 +02:00 |
|
Bas Vb
|
f926f86d7d
|
Small fix because the cleaned up items were not sent back
|
2014-04-23 12:14:20 +02:00 |
|
Nout van Deijck
|
6dd03c293a
|
Added check for already visited redirects of compounds
|
2014-04-23 12:08:33 +02:00 |
|
Bas Vb
|
cb299df96f
|
Added log statements
|
2014-04-23 11:46:43 +02:00 |
|
Bas Vb
|
fd5faf22e4
|
Added empty reliability and condition to prevent errors for now
|
2014-04-23 11:12:58 +02:00 |
|
Bas Vb
|
1c518af5a6
|
Remove per attribute getfunctions
|
2014-04-23 11:06:59 +02:00 |
|
Jip J. Dekker
|
595af7aa32
|
PEP-8 and fixed a bug in set_spider
|
2014-04-22 19:03:29 +02:00 |
|
Jip J. Dekker
|
ba7bed0250
|
Disabled name mangling for the spider reference in the parsers
|
2014-04-22 18:55:14 +02:00 |
|
Jip J. Dekker
|
648b23e466
|
PEP-8 standards for a lot of things
|
2014-04-22 18:54:10 +02:00 |
|
Jip J. Dekker
|
0da2d74e2c
|
PEP-8 indentation for multi-line statements
|
2014-04-22 18:46:49 +02:00 |
|
Jip J. Dekker
|
7a1e99605b
|
Uniform TODO tags, indentation faults.
|
2014-04-22 18:40:14 +02:00 |
|
Bas Vb
|
b0146cdce8
|
Added regular expressions to clean up temperature data
|
2014-04-22 09:46:19 +02:00 |
|
RTB
|
63fb9f4733
|
added comment to parse_searchrequest and added optional todo for extract()[0] usage
|
2014-04-18 17:33:00 +02:00 |
|
RTB
|
3c5dbc44dc
|
added comments for chemspider parse_extendedinfo
|
2014-04-18 17:14:19 +02:00 |
|
RTB
|
2ac6d1711d
|
added comments for chemspider new_synonym
|
2014-04-18 17:11:04 +02:00 |
|
RTB
|
3862bfb7d8
|
added comments for ChemSpider class, parse_properties, and parse_synonyms
|
2014-04-18 16:54:30 +02:00 |
|
RTB
|
f18f23dfc6
|
chemspider new_compound_request is now PEP-8 compliant
|
2014-04-18 16:20:47 +02:00 |
|
RTB
|
074fbdf9e2
|
changed source for properties by parse_extendedinfo to 'ChemSpider ExtendedCompoundInfo'
|
2014-04-18 16:13:44 +02:00 |
|
RTB
|
fa22356cb2
|
chemspider parse_extendedinfo is now PEP-8 compliant
|
2014-04-18 16:12:07 +02:00 |
|
RTB
|
479182d77e
|
chemspider new_synonym is now PEP-8 compliant
|
2014-04-18 16:07:06 +02:00 |
|
RTB
|
319e028717
|
chemspider parse_properties a bit more PEP-8 compliant
|
2014-04-18 15:55:04 +02:00 |
|
RTB
|
9aae8d2d07
|
chemspider parse_properties is now PEP-8 compliant, hopefully
|
2014-04-18 15:38:07 +02:00 |
|
RTB
|
c1c7cfc117
|
edited global strings to be consistent (PEP-8)
|
2014-04-18 15:12:22 +02:00 |
|
RTB
|
f2cacb79eb
|
properties from Predicted - ACD/Labs tab now include conditions from value variable
|
2014-04-18 15:03:06 +02:00 |
|
RTB
|
3bf8dccf18
|
properties from Predicted - ACD/Labs tab now include conditions from attribute variable
|
2014-04-18 14:59:56 +02:00 |
|
RTB
|
cd8a64816f
|
removed colon at end of attributes in Experimental and Predicted ACD/labs tabs
|
2014-04-18 14:10:53 +02:00 |
|
RTB
|
22c765b6e5
|
simplified setting of Results for Predicted ACD/Labs tab
|
2014-04-18 14:08:48 +02:00 |
|
RTB
|
75d248e6cf
|
changed for loop in parse_properties to use zip instead of enumerate
|
2014-04-18 13:45:32 +02:00 |
|
RTB
|
bf4a5bb41f
|
added scraping of synonyms labeled as 'synonym_cn'
|
2014-04-18 13:36:33 +02:00 |
|