Archived

95 Commits

Author SHA1 Message Date
Jip J. Dekker
873231439c Merge branch 'develop' into feature/Wikipedia 2014-04-16 16:59:25 +02:00
Jip J. Dekker
d603e388e6 Merge branch 'hotfix/1-searchable' into develop 2014-04-16 16:58:53 +02:00
Jip J. Dekker
ab2a3fdc08 typo! 2014-04-16 16:57:27 +02:00
Jip J. Dekker
f0d10902b5 Searchable can't be a list! 2014-04-16 16:57:08 +02:00
Jip J. Dekker
efacc08a3d Merge branch 'develop' into feature/Wikipedia (Conflicts: Fourmi.py) 2014-04-16 16:49:03 +02:00
Bas Vb
6f82b117c9 new function to clean up the datapoints 2014-04-16 16:23:33 +02:00
Bas Vb
74aa446f40 minor edits (comments etc.) 2014-04-16 15:27:36 +02:00
Bas Vb
34c3a8b4d6 remove empty data points 2014-04-16 15:22:47 +02:00
Bas Vb
ce3105f3c1 went to a general loop over all values, this way getting all elements from the Wikipedia infobox (except for those with a colspan, because these mess up) 2014-04-16 14:56:32 +02:00
Bas Vb
f1280dd66d get value not list from xpath 2014-04-16 13:23:50 +02:00
Bas Vb
d99548e3b6 Added density, molar entropy and heat capacity 2014-04-16 11:14:02 +02:00
Bas Vb
d778050f36 Able to parse the weblinks to other databases, one example done 2014-04-16 10:37:57 +02:00
Bas Vb
cd1637b0fe Both Boiling point and melting point are now parsed from chemical Wikipedia pages, there's one error about different types of attributes in the Result-items, this needs to be fixed by cleaning up the retrieved data. 2014-04-16 00:50:50 +02:00
Bas Vb
1ca3593ae1 Parse is runnable now. 2014-04-16 00:35:19 +02:00
Jip J. Dekker
6799a1a956 Merge branch 'release/v0.1.0' into develop 1-searchable 2014-04-15 19:49:07 +02:00
Jip J. Dekker
2d5e39de81 Merge branch 'release/v0.1.0' v0.1.0 2014-04-15 19:48:55 +02:00
Jip J. Dekker
972e5da0d2 Removed debug code and typos. 2014-04-15 19:48:27 +02:00
Jip J. Dekker
d770f79a7a Bumped version number 2014-04-15 19:46:10 +02:00
Jip J. Dekker
878d8e5efb Merge branch 'feature/CLI' into develop 2014-04-15 19:44:41 +02:00
Jip J. Dekker
61ca2520e3 Added feed export functionality 2014-04-15 19:40:54 +02:00
Jip J. Dekker
e65d3a6898 Added the options for the Feed exports 2014-04-15 18:57:51 +02:00
Jip J. Dekker
ffb3861034 Search for single compound, filename should be lowercase 2014-04-15 18:49:30 +02:00
Jip J. Dekker
91ed053ac5 Stopped log from interfering with STDOUT 2014-04-15 18:17:35 +02:00
Jip J. Dekker
a4dd6e1835 Made logging work 2014-04-14 21:31:20 +02:00
Jip J. Dekker
2ad33080c6 First setup of the CLI, decided on a structure 2014-04-14 20:45:07 +02:00
Jip J. Dekker
ee01e697d3 Added Docopt as a CLI framework 2014-04-14 20:21:41 +02:00
Bas Vb
f9799c30d8 Parse is runnable now. 2014-04-08 14:59:09 +02:00
Jip J. Dekker
e10ac12d04 Merge branch 'develop' into feature/Wikipedia 2014-04-08 11:45:23 +02:00
Jip J. Dekker
debbc5e62a Merge branch 'hotfix/none-requests' into develop 2014-04-08 11:44:42 +02:00
Jip J. Dekker
199fa5419e Merge branch 'hotfix/none-requests' 2014-04-08 11:44:26 +02:00
Jip J. Dekker
622dd4ad00 Small fix to ensure unique classes and load all parsers 2014-04-08 11:43:32 +02:00
Jip J. Dekker
da17a149c0 Spider is now able to handle none-request from parsers while handling new compounds 2014-04-08 11:42:43 +02:00
Jip J. Dekker
4b0c4acf96 Updated the wikipedia parser as a rightful subclass of Parser 2014-04-08 11:40:30 +02:00
Bas Vb
f3807c3018 Fixed the errors, but still not able to run/test the parse() function 2014-04-06 20:28:03 +02:00
Bas Vb
add4a13a4d Trying to make a start with the WikipediaParser, but I can't find out with the Scrapy website (or another way) what the structure of the file should be, and how I can test/run the crawling on a page. 2014-04-06 18:02:09 +02:00
Nout van Deijck
81a93c44bb added author 2014-04-03 12:19:17 +02:00
Bas Vb
60c409da3d New file and branch for the Wikipedia parser 2014-04-03 12:05:06 +02:00
Bas Vb
b4ff4a3c3b New file and branch for the Wikipedia parser 2014-04-03 12:00:27 +02:00
Jip J. Dekker
3a074467e6 Merge branch 'hotfix/No_TABs' into develop 2014-04-02 14:22:13 +02:00
Jip J. Dekker
9805bb5adb Merge branch 'hotfix/No_TABs' 2014-04-02 14:21:34 +02:00
Jip J. Dekker
f6981057df Changed everything to spaces 2014-04-02 14:20:05 +02:00
Jip J. Dekker
595f0253e2 Merge branch 'release/v0.0.1' into develop 2014-04-01 21:44:31 +02:00
Jip J. Dekker
254e8db3aa Merge branch 'release/v0.0.1' v0.0.1 2014-04-01 21:44:08 +02:00
Jip J. Dekker
c9e09f8ab9 Added a version message 2014-04-01 21:42:54 +02:00
Jip J. Dekker
2e8017c590 Merge branch 'feature/parsing-scheme' into develop 2014-04-01 21:40:26 +02:00
Jip J. Dekker
7bc160f676 The spider is now able to start using the synonym generator 2014-04-01 21:38:11 +02:00
Jip J. Dekker
cd421cc2fb Replaced literal for testing with a variable fix. 2014-04-01 21:24:04 +02:00
Jip J. Dekker
0bf2d102c6 Fixed parser importation, so it doesn't import imported classes. 2014-04-01 21:21:30 +02:00
Jip J. Dekker
683f8c09d4 Quick fix, python errors 2014-04-01 21:12:54 +02:00
Jip J. Dekker
f93dc2d160 Added a structure to get requests for all websites for a new synonym 2014-04-01 21:07:36 +02:00