Hackathon Project: Issue reshaping the data from collection eventscsv to collection events

Hello, I am watching the video Introduction to GDELT for the MongoDB World Hackathon 22 - Session 1 MongoDB, presented by Shane and Joe Drumgoole.

I am trying to replicate what they do in the video, but I have an issue when trying to reshape the data from the eventscsv collection to the events collection.

Running:

gdelttools-master % make reshape

I can see in terminal:

mongosh --quiet --file=gdelt_reshaper.js

But the events collection is not created and no further output is shown.

I have no experience working with make and I don’t know whether any prior configuration is needed in the Makefile.

(I didn’t make any changes to the Makefile; it’s the same as in the repo.)

I am using a Mac for the hackathon.

I have just installed make (it was not installed previously):

$ brew install make

And tried again:

gdelttools-master % make reshape

I would appreciate help executing the script correctly. Thanks in advance.

Hi Manuel,
I reshaped the data by passing gdelt_reshaper.js directly to mongosh.

Welcome


I’d recommend running the gdelt_reshaper.js directly with mongosh, as @Crist recommends.

If you’re not running MongoDB on localhost and the default port, you may also want to provide the connection string of your MongoDB cluster to mongosh, like this:

mongosh mongodb+ssh://username:password@your.cluster.com/yourdatabase gdelt_reshaper.js

Hi Manuel,

The reshape step assumes you have already loaded some data with the gdeltloader script. If you follow the package steps you should get some output.

pip install gdelttools # install the package
gdeltloader --master --download --overwrite --last 20 # download the last 20 days of data
make full_data_load #load the downloaded data and reshape it

If you do those steps in order you should get some output.


Thank you, I followed your answer with some suggestions from @Mark_Smith, so I will post my next question as a reply to his answer.

Thank you. I’m not running MongoDB on localhost, so I tried providing the connection string of my MongoDB cluster to mongosh.

I needed to change the URI a bit:

mongosh mongodb+ssh://.....

Results:

MongoshInvalidInputError: [COMMON-10001] Invalid URI:

Using this instead:

mongosh mongodb+srv://.....

Results:

Current Mongosh Log ID: 6279438bdf708b5009da2113
Connecting to:          mongodb+srv://<credentials>@worldhack22.s8ynf.mongodb.net/databasehackaton2022
Using MongoDB:          5.0.8
Using Mongosh:          1.1.8

For mongosh info see: https://docs.mongodb.com/mongodb-shell/


To help improve our products, anonymous usage data is collected and sent to MongoDB periodically (https://www.mongodb.com/legal/privacy-policy).
You can opt-out by running the disableTelemetry() command.

Loading file: gdelt_reshaper.js

But nothing more happens and the events collection is not created.
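
As a side note on the connection string itself: the log above shows that the mongodb+srv scheme connects, and that the part after the last slash selects the database. If the username or password contains special characters such as @ or /, they need to be percent-encoded before being placed in the URI. A minimal sketch of how such a string could be assembled, assuming a hypothetical helper name build_srv_uri and placeholder host and database values (none of this comes from gdelttools):

```python
from urllib.parse import quote_plus


def build_srv_uri(username, password, host, database):
    """Assemble a mongodb+srv connection string with escaped credentials.

    Hypothetical helper for illustration; it is not part of gdelttools.
    """
    # Percent-encode the credentials so characters like '@' or '/'
    # cannot be mistaken for URI delimiters.
    user = quote_plus(username)
    secret = quote_plus(password)
    return f"mongodb+srv://{user}:{secret}@{host}/{database}"


if __name__ == "__main__":
    # Placeholder values; replace them with your own cluster details.
    print(build_srv_uri("username", "p@ss/word", "your.cluster.com", "yourdatabase"))
```

The resulting string can be passed to mongosh in place of the URI shown above. Note that the database in the URI is only the default one for the session; a script that names another database explicitly will still write there.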

Thank you, I tried your answer. I already had the gdelttools package installed.

The following downloads the files:

gdeltloader --master --download --overwrite --last 20 # download the last 20 days of data

But I tried the command in both directories (gdelttools-master and gdelttools) with the same result:

gdelttools-master % make full_data_load

gdelttools % make full_data_load

Results:

make: *** No rule to make target `full_data_load'. Stop.

I just realized that the Makefile has no full_data_load target as you said; the target is full_dataload instead.

I tried from the directory gdelttools-master and I get the following result:

gdelttools-master % make full_dataload
python gdelttools/gdeltloader.py --master --download
File "gdelttools/gdeltloader.py", line 22
parser = argparse.ArgumentParser(epilog=f"Version: {version}\n"
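
The traceback above is cut off, so the actual error is not shown. One possibility (an assumption, not something the output confirms) is that the `python` command invoked by make is an interpreter older than Python 3.6, which cannot parse the f-string on that line. A minimal sketch to check this, using a hypothetical helper name supports_fstrings and written so that it also runs on Python 2:

```python
from __future__ import print_function

import sys


def supports_fstrings(version_info=sys.version_info):
    """Return True if the given interpreter version can parse f-strings (3.6+)."""
    return tuple(version_info[:2]) >= (3, 6)


if __name__ == "__main__":
    # Run this with the same `python` command that the Makefile uses.
    print("Interpreter:", sys.executable)
    print("Version:", sys.version.split()[0])
    print("f-strings supported:", supports_fstrings())
```

If this reports that f-strings are not supported, running the loader with a Python 3 interpreter would be the thing to try.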

I can see now that the events collection has been created, but in a database called GDELT2 (I had named mine databasehackaton2022). Maybe your answer worked correctly. I’ll investigate again and post the results here. Thanks again.

I started a new project from scratch and called the database GDELT2 (I remember now that @Joe_Drumgoole said to do so in the video).

I noticed that this database name is used in the following scripts:

- gdelt_reshaper.js

- gdeltloader.py

- mongoimport.py

- validator.py

Following your reply, the events collection was created successfully. Thanks to you, and also to @Crist and @Joe_Drumgoole, for your help.
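
For anyone who wants to confirm where the database name appears before choosing their own, a small search over the scripts listed above can help. This is only a sketch: the function name find_name_in_files is hypothetical, and the file paths are assumed to be relative to the current directory, so they may need adjusting to match the repo layout.

```python
from pathlib import Path


def find_name_in_files(paths, name="GDELT2"):
    """Return {path: [line numbers]} for each existing file that contains `name`."""
    hits = {}
    for path in map(Path, paths):
        if not path.is_file():
            # Skip paths that do not exist in the current directory.
            continue
        lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        matches = [i for i, line in enumerate(lines, start=1) if name in line]
        if matches:
            hits[str(path)] = matches
    return hits


if __name__ == "__main__":
    # File names taken from the list above; their locations are assumptions.
    scripts = ["gdelt_reshaper.js", "gdeltloader.py", "mongoimport.py", "validator.py"]
    for path, line_numbers in find_name_in_files(scripts).items():
        print(path, line_numbers)
```

Each reported line is a place where the database name would need to change if a different name is used.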

