Nout van Deijck | 1ced65e2b6 | Parser now adds extra requests for every identifier to an external source that is in the Wikipedia chembox | 2014-04-23 13:18:50 +02:00
Nout van Deijck | b5c83125f7 | Added extra request for ChemSpider link retrieved from Wikipedia | 2014-04-23 12:27:53 +02:00
Bas Vb | f926f86d7d | Small fix because the cleaned-up items were not sent back | 2014-04-23 12:14:20 +02:00
Nout van Deijck | 6dd03c293a | Added check for already visited redirects of compounds | 2014-04-23 12:08:33 +02:00
Bas Vb | cb299df96f | Added log statements | 2014-04-23 11:46:43 +02:00
Bas Vb | fd5faf22e4 | Added empty reliability and condition to prevent errors for now | 2014-04-23 11:12:58 +02:00
Bas Vb | 1c518af5a6 | Removed per-attribute get functions | 2014-04-23 11:06:59 +02:00
Bas Vb | b0146cdce8 | Added regular expressions to clean up temperature data | 2014-04-22 09:46:19 +02:00
Bas Vb | be63315ca2 | Regex | 2014-04-16 17:01:35 +02:00
Jip J. Dekker | efacc08a3d | Merge branch 'develop' into feature/Wikipedia (conflicts: Fourmi.py) | 2014-04-16 16:49:03 +02:00
Bas Vb | 6f82b117c9 | New function to clean up the data points | 2014-04-16 16:23:33 +02:00
Bas Vb | 74aa446f40 | Minor edits (comments etc.) | 2014-04-16 15:27:36 +02:00
Bas Vb | 34c3a8b4d6 | Remove empty data points | 2014-04-16 15:22:47 +02:00
Bas Vb | ce3105f3c1 | Switched to a general loop over all values, this way getting all elements from the Wikipedia infobox (except those with a colspan, because these break the parsing) | 2014-04-16 14:56:32 +02:00
Bas Vb | f1280dd66d | Get value, not list, from XPath | 2014-04-16 13:23:50 +02:00
Bas Vb | d99548e3b6 | Added density, molar entropy and heat capacity | 2014-04-16 11:14:02 +02:00
Bas Vb | d778050f36 | Able to parse the web links to other databases; one example done | 2014-04-16 10:37:57 +02:00
Bas Vb | cd1637b0fe | Both boiling point and melting point are now parsed from chemical Wikipedia pages; there is one error about different types of attributes in the Result items, which needs to be fixed by cleaning up the retrieved data | 2014-04-16 00:50:50 +02:00
Bas Vb | 1ca3593ae1 | Parse is runnable now | 2014-04-16 00:35:19 +02:00
Jip J. Dekker | 61ca2520e3 | Added feed export functionality | 2014-04-15 19:40:54 +02:00
Jip J. Dekker | ffb3861034 | Search for single compound; filename should be lowercase | 2014-04-15 18:49:30 +02:00
Bas Vb | f9799c30d8 | Parse is runnable now | 2014-04-08 14:59:09 +02:00
Jip J. Dekker | e10ac12d04 | Merge branch 'develop' into feature/Wikipedia | 2014-04-08 11:45:23 +02:00
Jip J. Dekker | da17a149c0 | Spider is now able to handle None requests from parsers while handling new compounds | 2014-04-08 11:42:43 +02:00
Jip J. Dekker | 4b0c4acf96 | Updated the Wikipedia parser as a rightful subclass of Parser | 2014-04-08 11:40:30 +02:00
Bas Vb | f3807c3018 | Fixed the errors, but still not able to run/test the parse() function | 2014-04-06 20:28:03 +02:00
Bas Vb | add4a13a4d | Trying to make a start with the WikipediaParser, but I can't find out from the Scrapy website (or another way) what the structure of the file should be, or how to test/run the crawling on a page | 2014-04-06 18:02:09 +02:00
Nout van Deijck | 81a93c44bb | Added author | 2014-04-03 12:19:17 +02:00
Bas Vb | 60c409da3d | New file and branch for the Wikipedia parser | 2014-04-03 12:05:06 +02:00
Bas Vb | b4ff4a3c3b | New file and branch for the Wikipedia parser | 2014-04-03 12:00:27 +02:00
Jip J. Dekker | f6981057df | Changed everything to spaces | 2014-04-02 14:20:05 +02:00
Jip J. Dekker | 7bc160f676 | The spider is now able to start using the synonym generator | 2014-04-01 21:38:11 +02:00
Jip J. Dekker | 683f8c09d4 | Quick fix, Python errors | 2014-04-01 21:12:54 +02:00
Jip J. Dekker | f93dc2d160 | Added a structure to get requests for all websites for a new synonym | 2014-04-01 21:07:36 +02:00
Jip J. Dekker | e39ed3b681 | Added a way for parsers to access the spider | 2014-04-01 20:56:32 +02:00
Jip J. Dekker | 4d9e5307bf | Wrote a loader for all parsers in the parser directory | 2014-03-31 00:48:45 +02:00
Jip J. Dekker | 0cc1b23353 | Added the functionality to add parsers and automatically use them | 2014-03-30 23:37:42 +02:00
Jip J. Dekker | 14c27458fc | Fixed an import error | 2014-03-30 23:07:28 +02:00
Jip J. Dekker | 32cedecf2e | Added a basic parser class to extend; next step is implementing the global function | 2014-03-28 14:44:17 +01:00
Jip J. Dekker | 87d1041517 | Made all Python files PEP 8 compliant | 2014-03-28 14:11:36 +01:00
Jip J. Dekker | 5b17627504 | The parsers, however, could use their own folder | 2014-03-27 13:23:03 +01:00
Jip J. Dekker | 8e9314e753 | One spider should have its own folder | 2014-03-27 13:18:55 +01:00
Jip J. Dekker | 8175e02f6c | New structure, splitting on parsers instead of spiders | 2014-03-27 13:08:46 +01:00
Jip J. Dekker | b1840d3a65 | Another name change to accommodate an executable script | 2014-03-18 17:44:32 +01:00