Bas Vb
|
d778050f36
|
Able to parse the web links to other databases; one example done
|
2014-04-16 10:37:57 +02:00 |
|
Bas Vb
|
cd1637b0fe
|
Both boiling point and melting point are now parsed from chemical Wikipedia pages. There is one error about differing attribute types in the Result items; this needs to be fixed by cleaning up the retrieved data.
|
2014-04-16 00:50:50 +02:00 |
|
Bas Vb
|
1ca3593ae1
|
Parse is runnable now.
|
2014-04-16 00:35:19 +02:00 |
|
Bas Vb
|
f9799c30d8
|
Parse is runnable now.
|
2014-04-08 14:59:09 +02:00 |
|
Jip J. Dekker
|
e10ac12d04
|
Merge branch 'develop' into feature/Wikipedia
|
2014-04-08 11:45:23 +02:00 |
|
Jip J. Dekker
|
da17a149c0
|
Spider is now able to handle None requests from parsers while handling new compounds
|
2014-04-08 11:42:43 +02:00 |
|
Jip J. Dekker
|
4b0c4acf96
|
Updated the Wikipedia parser to be a rightful subclass of Parser
|
2014-04-08 11:40:30 +02:00 |
|
Bas Vb
|
f3807c3018
|
Fixed the errors, but still not able to run/test the parse() function
|
2014-04-06 20:28:03 +02:00 |
|
Bas Vb
|
add4a13a4d
|
Trying to make a start with the WikipediaParser, but I can't find out from the Scrapy website (or any other way) what the structure of the file should be, or how I can test/run the crawling on a page.
|
2014-04-06 18:02:09 +02:00 |
|
Nout van Deijck
|
81a93c44bb
|
added author
|
2014-04-03 12:19:17 +02:00 |
|
Bas Vb
|
60c409da3d
|
New file and branch for the Wikipedia parser
|
2014-04-03 12:05:06 +02:00 |
|
Bas Vb
|
b4ff4a3c3b
|
New file and branch for the Wikipedia parser
|
2014-04-03 12:00:27 +02:00 |
|
Jip J. Dekker
|
f6981057df
|
Changed everything to spaces
|
2014-04-02 14:20:05 +02:00 |
|
Jip J. Dekker
|
7bc160f676
|
The spider is now able to start using the synonym generator
|
2014-04-01 21:38:11 +02:00 |
|
Jip J. Dekker
|
683f8c09d4
|
Quick fix, python errors
|
2014-04-01 21:12:54 +02:00 |
|
Jip J. Dekker
|
f93dc2d160
|
Added a structure to get requests for all websites for a new synonym
|
2014-04-01 21:07:36 +02:00 |
|
Jip J. Dekker
|
e39ed3b681
|
Added a way for parsers to access the spider.
|
2014-04-01 20:56:32 +02:00 |
|
Jip J. Dekker
|
4d9e5307bf
|
Wrote a loader for all parsers in the parser directory.
|
2014-03-31 00:48:45 +02:00 |
|
Jip J. Dekker
|
0cc1b23353
|
Added the functionality to add parsers and automatically use them.
|
2014-03-30 23:37:42 +02:00 |
|
Jip J. Dekker
|
14c27458fc
|
Fixed an import error
|
2014-03-30 23:07:28 +02:00 |
|
Jip J. Dekker
|
32cedecf2e
|
Added a basic parser class to extend; the next step is implementing the global function
|
2014-03-28 14:44:17 +01:00 |
|
Jip J. Dekker
|
87d1041517
|
Made all Python files PEP 8 compatible
|
2014-03-28 14:11:36 +01:00 |
|
Jip J. Dekker
|
5b17627504
|
The parsers however could use their own folder
|
2014-03-27 13:23:03 +01:00 |
|
Jip J. Dekker
|
8e9314e753
|
One spider should have its own folder
|
2014-03-27 13:18:55 +01:00 |
|
Jip J. Dekker
|
8175e02f6c
|
New Structure, splitting on parsers instead of Spiders
|
2014-03-27 13:08:46 +01:00 |
|
Jip J. Dekker
|
b1840d3a65
|
Another name change to accommodate an executable script
|
2014-03-18 17:44:32 +01:00 |
|