Archived

375 Commits

Author SHA1 Message Date
RTB
22fa67735d added parse_searchrequest function 2014-04-12 19:41:36 +02:00
RTB
246463b450 simplified debug output, WARNING label should be temporary 2014-04-12 19:19:56 +02:00
RTB
423cb90a6a Merge branch 'develop' into feature/chemspider-parser 2014-04-12 19:13:02 +02:00
RTB
0e3ef9a792 hardcoded ChemSpider API token into ChemSpider.py 2014-04-08 16:14:47 +02:00
Bas Vb
f9799c30d8 Parse is runnable now. 2014-04-08 14:59:09 +02:00
RTB
a4dc8c8711 corrected Chemspider parser to be a subclass of Parser 2014-04-08 13:10:02 +02:00
RTB
0da286c907 created basic structure of ChemSpider search parser 2014-04-08 12:08:45 +02:00
Jip J. Dekker
e10ac12d04 Merge branch 'develop' into feature/Wikipedia 2014-04-08 11:45:23 +02:00
Jip J. Dekker
debbc5e62a Merge branch 'hotfix/none-requests' into develop 2014-04-08 11:44:42 +02:00
Jip J. Dekker
199fa5419e Merge branch 'hotfix/none-requests' 2014-04-08 11:44:26 +02:00
Jip J. Dekker
622dd4ad00 Small fix to ensure unique classes and load all parsers 2014-04-08 11:43:32 +02:00
Jip J. Dekker
da17a149c0 Spider is now able to handle none-request from parsers while handling new compounds 2014-04-08 11:42:43 +02:00
Jip J. Dekker
4b0c4acf96 Updated the Wikipedia parser as a rightful subclass of Parser 2014-04-08 11:40:30 +02:00
Bas Vb
f3807c3018 Fixed the errors, but still not able to run/test the parse() function 2014-04-06 20:28:03 +02:00
Bas Vb
add4a13a4d Trying to make a start with the WikipediaParser, but I can't find out with the Scrapy website (or another way) what the structure of the file should be, and how I can test/run the crawling on a page. 2014-04-06 18:02:09 +02:00
Nout van Deijck
81a93c44bb added author 2014-04-03 12:19:17 +02:00
Bas Vb
60c409da3d New file and branch for the Wikipedia parser 2014-04-03 12:05:06 +02:00
Bas Vb
b4ff4a3c3b New file and branch for the Wikipedia parser 2014-04-03 12:00:27 +02:00
Jip J. Dekker
3a074467e6 Merge branch 'hotfix/No_TABs' into develop 2014-04-02 14:22:13 +02:00
Jip J. Dekker
9805bb5adb Merge branch 'hotfix/No_TABs' 2014-04-02 14:21:34 +02:00
Jip J. Dekker
f6981057df Changed everything to spaces 2014-04-02 14:20:05 +02:00
Jip J. Dekker
595f0253e2 Merge branch 'release/v0.0.1' into develop 2014-04-01 21:44:31 +02:00
Jip J. Dekker
254e8db3aa Merge branch 'release/v0.0.1' v0.0.1 2014-04-01 21:44:08 +02:00
Jip J. Dekker
c9e09f8ab9 Added a version message 2014-04-01 21:42:54 +02:00
Jip J. Dekker
2e8017c590 Merge branch 'feature/parsing-scheme' into develop 2014-04-01 21:40:26 +02:00
Jip J. Dekker
7bc160f676 The spider is now able to start using the synonym generator 2014-04-01 21:38:11 +02:00
Jip J. Dekker
cd421cc2fb Replaced literal for testing with a variable fix. 2014-04-01 21:24:04 +02:00
Jip J. Dekker
0bf2d102c6 Fixed parser importation, so it doesn't import imported classes. 2014-04-01 21:21:30 +02:00
Jip J. Dekker
683f8c09d4 Quick fix, python errors 2014-04-01 21:12:54 +02:00
Jip J. Dekker
f93dc2d160 Added a structure to get requests for all websites for a new synonym 2014-04-01 21:07:36 +02:00
Jip J. Dekker
e39ed3b681 Added a way for parsers to access the spider. 2014-04-01 20:56:32 +02:00
Jip J. Dekker
4d9e5307bf Written a loader for all parsers in the parser directory. 2014-03-31 00:48:45 +02:00
Jip J. Dekker
0cc1b23353 Added the functionality to add parsers and automatically use them. 2014-03-30 23:37:42 +02:00
Jip J. Dekker
6e2df64fe4 Merge branch 'hotfix/spider-import-error' into develop 2014-03-30 23:08:14 +02:00
Jip J. Dekker
a6d3d4a716 Merge branch 'hotfix/spider-import-error' spider-import-error 2014-03-30 23:07:52 +02:00
Jip J. Dekker
14c27458fc Fixed an import error 2014-03-30 23:07:28 +02:00
Jip J. Dekker
e0556bbf16 Merge branch 'release/basic-scraper-structure' basic-scraper-structure 2014-03-30 22:16:13 +02:00
Jip J. Dekker
e210ce8558 Merge branch 'develop', remote-tracking branch 'origin/develop' into develop 2014-03-30 22:08:21 +02:00
Jip J. Dekker
6bbee865c4 Merge branch 'feature/basic-structure' into develop 2014-03-28 14:46:43 +01:00
Jip J. Dekker
1e730e77ce Merge branch 'feature/basic-structure' of code.giphouse.nl:giphouse/descartes-2 into feature/basic-structure 2014-03-28 14:44:29 +01:00
Jip J. Dekker
32cedecf2e Added a basic parser class to extend, next step implementing the global function 2014-03-28 14:44:17 +01:00
Jip J. Dekker
325febe834 Added a basic parser class to extend, next step implementing the global function 2014-03-28 14:43:22 +01:00
Jip J. Dekker
d91706d6e5 The script should stop sometime, added a stopping signal 2014-03-28 14:14:39 +01:00
Jip J. Dekker
87d1041517 Made all Python files PEP-8 Compatible 2014-03-28 14:11:36 +01:00
Jip J. Dekker
5b17627504 The parsers however could use their own folder 2014-03-27 13:23:03 +01:00
Jip J. Dekker
8e9314e753 One spider should have its own folder 2014-03-27 13:18:55 +01:00
Jip J. Dekker
bdcf359da7 Logical fixes to have some "working" case 2014-03-27 13:12:27 +01:00
Jip J. Dekker
8175e02f6c New Structure, splitting on parsers instead of Spiders 2014-03-27 13:08:46 +01:00
Jip J. Dekker
306a37db1a A better structure which is able to start multiple spiders. 2014-03-22 15:48:08 +01:00
Jip J. Dekker
aa65bbd459 Merge branch 'feature/basic-structure' into develop 2014-03-18 18:10:03 +01:00