Comments (5)
Could you please post the output of the following commands:
select * from release_artist where artist_id = '0'
(might be where artist_id = 0)
select * from release_track_artist_fk where artist_id = '118760'
select * from release_company where company_id = '670384'
from discogs-xml2db.
Hi, I just imported discogs_20200901 and had similar issues. I believe they are db problems, not script problems.
Since you requested more info: using pgadmin4 against discogs_20200901 imported via discogs-xml2db, I got the following results:
select count(*) from public.release_artist where artist_id = 0;
returns 243416
select count(*) from release_track_artist where artist_id = '118760' returns 64110
select * from release_company where company_id = '670384' returns one record with what looks like valid entries in each field.
from discogs-xml2db.
Thank you, @MarkInNVA - could you post some examples of results for each one of those selects?
from discogs-xml2db.
Artist 0 referred to in release_artist seems to be used for release credits that don't match an artist in the Discogs database.
e.g.
jthinksearch=# select * from release_artist where release_id=107;
id | release_id | artist_id | artist_name | extra | anv | position | join_string | role | tracks
-----+------------+-----------+------------------+-------+-----+----------+-------------+----------------------+--------
393 | 107 | 202 | Kid Scientific | 0 | | 1 | | |
394 | 107 | 6871553 | Fran Englehardt | 1 | | 1 | | Executive Producer |
395 | 107 | 0 | Matthew Sordillo | 1 | | 2 | | Executive Producer |
396 | 107 | 147370 | Mike Walsh | 1 | | 3 | | Executive Producer |
397 | 107 | 0 | Renae Loguidice | 1 | | 4 | | Photography |
398 | 107 | 202 | Kid Scientific | 1 | | 5 | | Producer, Written-By |
399 | 107 | 283613 | Skydiver (5) | 1 | | 6 | | Producer, Written-By |
If we look at https://www.discogs.com/release/107 we can see that the two artists with an artist_id of zero (Matthew Sordillo and Renae Loguidice) are listed on the webpage, but they are not hyperlinks.
Not sure what to do about this.
Artist 118760 referred to in release_track_artist is the meta artist 'No Artist', but I think it makes sense for our database to add this as an artist.
select * from release_track_artist where artist_id = 118760
200 | 267 | 56 | 1 | 118760 | No Artist | f | | 1 | | |
833 | 1387 | 319 | 3 | 118760 | No Artist | f | | 1 | | |
5794 | 7393 | 1241 | 1 | 118760 | No Artist | f | | 1 | | |
12916 | 16576 | 2964 | 14 | 118760 | No Artist | f | | 1 | | |
13760 | 17546 | 3122 | 1 | 118760 | No Artist | f | | 1 | |
e.g. see the first track on this release:
https://www.discogs.com/release/56-Dino-Terry-House-De-Luxe-Volume-2
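To see how many credit rows point at an artist_id with no matching artist row, an anti-join is one option. The following is only a rough sketch: it uses an in-memory SQLite database as a stand-in for the PostgreSQL import, the tables are cut down to a few columns, and the sample rows are copied from the release 107 output above. Which artists exist in the toy `artist` table is an assumption made for the illustration.

```python
import sqlite3

# In-memory SQLite stands in for the PostgreSQL database. Only the columns
# needed for the illustration are modelled (an assumption, not the real schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE release_artist (
        id INTEGER PRIMARY KEY,
        release_id INTEGER,
        artist_id INTEGER,
        artist_name TEXT
    );
    INSERT INTO artist (id, name) VALUES
        (202, 'Kid Scientific'),
        (147370, 'Mike Walsh');
    INSERT INTO release_artist (id, release_id, artist_id, artist_name) VALUES
        (393, 107, 202, 'Kid Scientific'),
        (395, 107, 0, 'Matthew Sordillo'),
        (396, 107, 147370, 'Mike Walsh'),
        (397, 107, 0, 'Renae Loguidice');
""")

# Anti-join: keep only credit rows whose artist_id has no row in artist,
# then count them per artist_id.
orphans = conn.execute("""
    SELECT ra.artist_id, COUNT(*) AS credits
    FROM release_artist AS ra
    LEFT JOIN artist AS a ON a.id = ra.artist_id
    WHERE a.id IS NULL
    GROUP BY ra.artist_id
    ORDER BY credits DESC
""").fetchall()

print(orphans)
```

The same SELECT should work unchanged in PostgreSQL, and swapping release_artist for release_track_artist would list the ids that block the foreign key on that table.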
from discogs-xml2db.
Tried adding
insert into artist(id,name) values(118760, 'No Artist');
but release_track_artist_fk_artist still fails because there are other rows that use artist_id=0, just like release_artist does.
Maybe release_artist and release_track_artist should both allow artist_id to be nullable?
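A small sketch of why a nullable artist_id would help: a foreign key column that holds NULL is not checked against the referenced table, while a value of 0 is. This uses in-memory SQLite as a stand-in for PostgreSQL (both follow the standard SQL behaviour here), and the cut-down table layout is an assumption for illustration, not the project's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE release_track_artist (
        id INTEGER PRIMARY KEY,
        artist_id INTEGER REFERENCES artist(id),  -- nullable on purpose
        artist_name TEXT
    );
    INSERT INTO artist (id, name) VALUES (118760, 'No Artist');
""")

# A credit that links to a known artist satisfies the foreign key.
conn.execute(
    "INSERT INTO release_track_artist VALUES (200, 118760, 'No Artist')"
)
# An unlinked credit stored as NULL is accepted: NULL is never checked.
conn.execute(
    "INSERT INTO release_track_artist VALUES (201, NULL, 'Renae Loguidice')"
)
# The same kind of credit stored as 0 is rejected, since no artist has id 0.
try:
    conn.execute(
        "INSERT INTO release_track_artist VALUES (202, 0, 'Matthew Sordillo')"
    )
    zero_rejected = False
except sqlite3.IntegrityError:
    zero_rejected = True

rows = conn.execute(
    "SELECT id, artist_id FROM release_track_artist ORDER BY id"
).fetchall()
print(zero_rejected, rows)
```

For an already imported database, the equivalent step would presumably be an `UPDATE ... SET artist_id = NULL WHERE artist_id = 0` on both tables before creating the constraints, assuming the columns are not declared NOT NULL.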
from discogs-xml2db.