Music library for extensive collections
darighteous
2008-01-20, 15:11
Hello,
first my problem: I have a considerable collection of mp3s (>25000), and so I am not able to use the library functionality. All of my mp3s are well tagged, categorized and sorted into folders, so I have to use the simple approach of navigating through my folders in search of what I am looking for. It would be nice to see a simple visualization of all my CD covers and skim through the collection for some inspiration, rather than scanning folder after folder.
So the idea I had: add an option to Setting/Music/Library to change the database to "Extensive Music Collection".
In this mode the library will collect all the usual information about an album, but without the single track info. So the library is searchable by artist, album, year, genre, last added, etc., but not by track. This would drastically reduce the number of records needed. When switching the library between Extensive and Normal Database, a warning should make clear that any existing music database will be erased and that this option is only necessary for collections with more than 10,000 files.
As nearly everybody will use a totally different file and folder structure, one way could be to determine the necessary information by using regular expressions (editable in the settings) that scan the folder and file structure (e.g. [%artist\%album\%tnumber - %track] or [%artist - %year - %album\%tnumber - %artist - %track], where %track will always be ignored).
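A minimal sketch of how such a template could be turned into a regular expression. Only the %names and the two example templates come from the post; the per-field patterns, the function names and the ".mp3" suffix handling are assumptions made for illustration, not anything XBMC provides.

```python
import re

# Hypothetical per-field patterns; only the %names come from the post.
FIELD_PATTERNS = {
    "artist": r"[^\\]+",
    "album": r"[^\\]+",
    "year": r"\d{4}",
    "tnumber": r"\d+",
    "track": r"[^\\]+",
}
_TOKEN = re.compile("%(" + "|".join(FIELD_PATTERNS) + ")")


def template_to_regex(template):
    # Splitting on a capturing group yields: literal, field, literal, field, ...
    pieces = _TOKEN.split(template)
    # Start matching at the beginning of the path or right after a backslash.
    pattern = r"(?:^|\\)"
    seen = set()
    for index, piece in enumerate(pieces):
        if index % 2 == 0:
            pattern += re.escape(piece)            # literal text such as " - "
        elif piece == "track":
            pattern += FIELD_PATTERNS[piece]       # matched, but never captured
        elif piece in seen:
            pattern += "(?P=%s)" % piece           # must repeat the first value
        else:
            seen.add(piece)
            pattern += "(?P<%s>%s)" % (piece, FIELD_PATTERNS[piece])
    return re.compile(pattern + r"\.mp3$", re.IGNORECASE)


def parse_path(template, path):
    # Returns the captured album-level fields, or None if the path does not fit.
    match = template_to_regex(template).search(path)
    return match.groupdict() if match else None
```

For example, parse_path(r"%artist\%album\%tnumber - %track", r"Miles Davis\Kind of Blue\01 - So What.mp3") gives the artist, album and track number, and drops the track title as proposed.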
The more logical and easier way would be to use the first two or three mp3s when scanning a folder. When %artist and %album of these files match, the album information will be stored from what's inside the id3 tag. When just %album matches, this seems to be a compilation and the artist will be "Various"; the rest of the info will be taken from the id3 tag. When neither %album nor %artist matches, this is a loose collection of mp3s and the folder will not be added to the library.
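The folder rule above can be sketched roughly as follows. The tag dictionaries are assumed to come from whatever id3 reader is in use (not shown here), the function name is made up, and treating "artist matches but album differs" as a skipped folder is an assumption, since the post does not cover that case.

```python
def _norm(value):
    # Compare tags case-insensitively and ignore surrounding whitespace.
    return (value or "").strip().lower()


def classify_folder(sample_tags):
    """sample_tags: id3 tags of the first two or three files, as dicts.

    Returns one album-level record, or None if the folder is skipped.
    """
    if not sample_tags:
        return None
    albums = {_norm(tag.get("album")) for tag in sample_tags}
    artists = {_norm(tag.get("artist")) for tag in sample_tags}
    if len(albums) != 1 or "" in albums:
        return None  # albums differ (or are missing): loose files, skip folder
    first = sample_tags[0]
    if len(artists) == 1 and "" not in artists:
        artist = first["artist"]   # artist and album match: a regular album
    else:
        artist = "Various"         # only the album matches: a compilation
    return {
        "artist": artist,
        "album": first["album"],
        "year": first.get("year"),
        "genre": first.get("genre"),
    }
```

This keeps one record per folder, which is where the reduction from tens of thousands of track rows to a few thousand album rows would come from.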
These kinds of collections could, maybe in the future, be added manually to the library. As some albums might be wrongly categorized (maybe someone used "Band feat. Solo Singer" as artist name and the album would fall under the Various definition), all the data should be editable. Again, this is something for future releases.
This way people with extensive music collections could benefit from the library functionality as well, as the saved records would be reduced to 1000-2500 entries for 25,000 songs (estimating 10-20 tracks per album and some sort of misc folders collecting single files).
An example: I will browse through genre/jazz in 3D List view and choose a CD from there, the view will change to the track view (pressing A will open the folder and read and show the tracks in the folder) or be added to the playlist (Y). It's basically like browsing smb-shares, but with nicer visuals (e.g. the covers) and sortable by year, genre or even searchable by artist or album name.
So, Ladies and Gentlemen, what is your opinion on this?
jmarshall
2008-01-20, 22:07
Why doesn't it work with 25k songs? I don't see any reason it wouldn't work - I know at least 15k songs works perfectly.
ultrabrutal
2008-01-20, 22:39
I don't have that many songs. I only own about 150 albums, but if speed is an issue it sounds like a good idea... However a database should be able to handle far more data than 25.000 rows
darighteous
2008-01-21, 13:08
Hello, I seem to remember that this might not be a limit of the database, but that there are certain limitations due to the memory of the xbox. Whenever I have scanned the content of my NAS to the XBOX so far, it froze at some point, so I thought this had something to do with the sheer size of the collection, the collected data and the limitations of the memory. If you are saying that the database will be scanned "on the fly" and that memory is not an issue, then I have to give it one more try, I guess...
gamerzhaven81
2008-01-21, 18:28
I have a music library of somewhere around 30,000 albums and I use the music library just fine with all of them.
Cheekyboy
2008-09-18, 20:30
I have tried various numbers of tracks/albums (using SQLite to generate dummy data) and have found that, using the library song view, the xbox runs out of memory long before you hit 50,000 tracks. On the Mac (1Gb memory) it's pretty much unusable at 170,000 tracks. At that level xbmc is using about 750Mb of real memory and 1.2Gb virtual and takes over 60 seconds to open the view. Obviously NOT opening the view is pretty much the only option without adding more memory to the system. I have found searches to work pretty quickly though, so it's still worth scanning in all your music.
Maybe the song view would benefit from chunking the data, particularly on the xbox where it just crashes the box? I'd be happy to test any fixes.
Gamester17
2008-09-18, 21:20
Just how many people do you actually think have more than 50,000 tracks? ???
...and if they have such an extensive collection why run XBMC on a computer?
By the way, note that the speed of the SQLite code has been significantly improved recently.
the songs node is rather unwieldy for large collections and i have been contemplating what could be done about this for a long time. on the xbox, the data is actually pulled out of the database in chunks, which is what allows the xbox to handle that many songs at all. the only way to allow more songs is to break up the display. i've thought of several ways to do this, and they each have their pros and cons, and it's somewhat dependent on how and why the user uses the songs node.
one way is to break up the songs node into sub-nodes by the first character of the song title. thus each node has fewer songs and requires less memory. the obvious con to this approach is that you don't see *all* songs at once. and this is already possible today with smart playlists.
another way is to paginate the display so that only 1000 songs are ever displayed. the cons here are all technical. first, it would be very very difficult to make it a "live" scrolling display. the obvious workaround is to use "previous page" and "next page" items at the top and bottom. but there are still other technical changes related to the sorting. the first change is that the sorting would have to be done by the database query, which is not done today. (hmm, i'm curious if this would be faster or slower.) then there's the complication of figuring out what page the user is on if they change the sorting.
another option is to remove the songs node altogether. its value is questionable.
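a rough sketch of the first two options in SQL terms, assuming a simplified song table with idSong and strTitle columns (the real schema and the function names here are assumptions, not taken from MusicDatabase.cpp). the first query gives the sub-nodes by first character, the second returns one page with the sorting done by the database.

```python
import sqlite3


def make_demo_db(titles):
    # Simplified, hypothetical stand-in for the real music database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE song (idSong INTEGER PRIMARY KEY, strTitle TEXT)")
    conn.executemany("INSERT INTO song (strTitle) VALUES (?)",
                     [(title,) for title in titles])
    return conn


def letter_buckets(conn):
    # One sub-node per first character, with the number of songs in it.
    return conn.execute(
        "SELECT upper(substr(strTitle, 1, 1)) AS letter, count(*) "
        "FROM song GROUP BY letter ORDER BY letter"
    ).fetchall()


# Whitelisted ORDER BY clauses; idSong breaks ties so pages stay stable.
SORT_CLAUSES = {
    "title": "strTitle COLLATE NOCASE, idSong",
    "id": "idSong",
}


def fetch_page(conn, page, page_size=1000, sort="title"):
    # Sorting happens in the query, so only one page is ever held in memory.
    sql = ("SELECT idSong, strTitle FROM song ORDER BY %s LIMIT ? OFFSET ?"
           % SORT_CLAUSES[sort])
    return conn.execute(sql, (page_size, page * page_size)).fetchall()
```

changing the sort order then simply means re-running fetch_page with another clause; which page to show afterwards is still the open question mentioned above.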
Gamester17
2008-09-18, 23:27
@kraqh3d, whatever you do, please do not make any pro changes for the minority niche users that are con for the majority of users
http://xbmc.org/about/vision/
Apply the Law of Diminishing Return - The majority of the effort should be invested in implementing features which have the most benefit and widest general usage by the community.
i never said i planned on making any changes, just that i've thought about this for a long time since the music library started out as my baby :)
** edit **
though, i don't understand the original "problem." the genre, artist, and album nodes should work just fine regardless of how many songs there are. and a simple "hide songs node" option to prevent accidentally entering that node would be trivial, as would a setting that prevents the search function from finding songs.
Cheekyboy
2008-09-19, 01:41
I'm sorry but a statement like "how many users do you know with a collection greater than 50,000 songs" is comparable with "who's ever going to need more than 640k of memory" or "who's going to be around in the year 2000"!!
The point is that "too many" songs means that xbmc pretty much becomes unusable from a songs point of view, which in reality is just when you would want something to help you find that track. Just look at those music juke boxes in pubs where they have access to millions of tracks. When you want to find something you want it to happen at a reasonable speed.
To be fair, the searching works pretty quickly even on the lowly xbox with its paltry 64Meg of memory (but unfortunately it still stacks with out of memory in songs view and struggles with more than about 13,000 artists, so the problem is beginning to creep), but the shock for me was that jumping to 1Gb of memory and a dual core processor didn't actually make a significant improvement. Yes, you can now open the song view (just) but you sure wouldn't want to. I'm a SQL server/MSAccess developer and know that 170,000 rows is not a problem in data terms, so I'm just interested to find out exactly where things are going wrong and how they can be fixed. I don't know much about SQLite, but I am sure the earlier versions of xbmc used to page the data; I just don't know if that was using database cursors or some in-memory method.
At the end of the day though, when a 49Mb, not very relational, database file translates to a view that's taking 750Mb of memory (forgetting about the 1.2Gb of virtual for the minute), then something can't be quite right. I'm more than happy to do what I can to contribute here, but at the end of the day it's got to first be recognised as an issue.
Kraqh3d, I'm more than happy to trial anything you wish to put forward as a solution. As you say, a huge list of songs is unwieldy, and at the end of the day that's pretty much what it will always be, but as long as you can enter the view and then search/filter without too much delay then that's all you could really hope for. I guess you could get fancy and allow users to build up a navigable "nodes tree" using auto/user defined search criteria, but how successful that might be I am not sure.
Anyway, happy to help, and I'm sure quite a lot of people would be pleased with anything that stops our beloved xboxes wheezing to an early grave :)
Gamester17
2008-09-19, 18:27
Yeah I understand what you are saying and I hope that some SQL guru will come along to optimize all databases in XBMC making it better for everyone, all I was trying to express is that if we can avoid it then we should never sacrifice the XBMC experience of the many to better the XBMC experience for the few.
The masses rule; without the masses there will be no future for XBMC. We do not want to make XBMC into a niche product. Our vision is quite the contrary: our aim is that XBMC should become the overall best media center on all platforms, and for that to happen a few compromises may have to be made along the way, compromises that might unavoidably affect a minority of XBMC users. We must always prioritize the majority if we want not only to survive but also to grow as a project and as a community that maintains and helps evolve that project.
My 2 cents (and the project members' common vision of XBMC's future)
http://xbmc.org/about/vision/
Best regards / Andreas Setterlind (a.k.a. Gamester17)
jmarshall
2008-09-20, 00:54
Cheekyboy: I'm HIGHLY surprised at the memory usage in particular, and am also surprised at the time taken (though less so if it's being chunked).
I can quite happily show 25k songs in the songs node on xbox (64Mb of ram) - takes about 10 seconds if I don't have it chunked (when chunking, it obviously takes more time as we effectively have an n(n-1)/2 algorithm involved). Removing the chunking completely should be done for non SDL platforms, certainly.
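To put a number on the n(n-1)/2 remark, here is a back-of-the-envelope sketch. It assumes each chunk query has to step over every row before its offset, and the chunk size of 5000 in the example is purely illustrative.

```python
import math


def offset_chunk_cost(total_rows, chunk_size):
    # Chunk i (counting from 0) starts at offset i * chunk_size, so the rows
    # stepped over in total are chunk_size * (0 + 1 + ... + (n - 1)),
    # which is chunk_size * n * (n - 1) / 2 for n chunks.
    chunks = math.ceil(total_rows / chunk_size)
    skipped = sum(i * chunk_size for i in range(chunks))
    return chunks, skipped


print(offset_chunk_cost(25000, 5000))    # (5, 50000)
print(offset_chunk_cost(170000, 5000))   # (34, 2805000)
```

Under that assumption, 25k songs cost about twice their own row count in skipped rows, while 170k songs cost more than sixteen times theirs.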
Perhaps you could provide your db file so that we can test?
The queries themselves are simple - check MusicDatabase.cpp. GetSongsByWhere() is the function in question. To remove the chunking take a look in GetSongsByNav() - I can't recall if I've already killed it for the SDL builds or not.
Cheers,
Jonathan
Cheekyboy
2008-09-20, 01:31
Hi Jonathan
I was surprised too. It's actually quite a dog on the mac. Even the artists view (25k) takes over 20 seconds. I tried a mac SQLite client app, MesaSQLite, and that was pretty poor too, using over 500Mb just doing a select * from song, but it might be the app itself and how it builds the data grid. I'd have to try some others to get a better comparison.
Another thing I find strange is that the memory isn't freed when you exit the song view. Maybe it's a strange Macism, or there may be a bug there. Obviously the memory is freed when you close the app, but exiting can take a good 10 seconds to complete once in that state. I don't know if it's lack of memory (1Gb) that's causing the excessive timings, but I'll probably get some more memory and get my putty knife out in the next couple of weeks, so I'll let you know when I up the memory.
Songs and artists have always been the heavy hitters on the xbox, so these are the worst cases on the mac too. Searching is pretty quick though, which is a bonus. I'm more than happy to send you the db to try; it should rar up quite small.
I'll see if I can get beta1 onto my PC and see how it works there too.
Cheekyboy
2008-09-20, 01:38
Forgot to mention the chunking. DOH! I'd have thought the whole purpose of chunking was to only return a portion of the data at a time? Maybe with a second count(*) just to get the total count. And then as the user pages through the fetched data you fetch the next chunk. This improves the user perception of speed, as they get some data displayed nice and quickly whilst subsequent data is loaded in the background whilst they browse. Is this not how the chunking is/was implemented?
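Roughly what I have in mind, as a sketch only: the song table with idSong and strTitle columns and the function names are assumptions for illustration, and the background loading is left out (the generator simply fetches the next chunk when it is asked for one).

```python
import sqlite3


def make_demo_db(count):
    # Simplified, hypothetical stand-in for the real music database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE song (idSong INTEGER PRIMARY KEY, strTitle TEXT)")
    conn.executemany("INSERT INTO song (strTitle) VALUES (?)",
                     [("Track %05d" % i,) for i in range(count)])
    return conn


def open_song_view(conn, chunk_size=5000):
    # One cheap count(*) up front so the view can show the total right away.
    total = conn.execute("SELECT count(*) FROM song").fetchone()[0]

    def chunks():
        # Each chunk is only queried when the caller asks for the next one.
        for offset in range(0, total, chunk_size):
            yield conn.execute(
                "SELECT idSong, strTitle FROM song "
                "ORDER BY idSong LIMIT ? OFFSET ?",
                (chunk_size, offset),
            ).fetchall()

    return total, chunks()
```

The view would call next() once to draw the first screen and keep pulling further chunks as the user scrolls.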
jmarshall
2008-09-20, 01:52
Awesome - if you could rar it up and upload it somewhere (dunno whether it'll rar down to email-able size?) that'd be great.
The chunking was put in place primarily for memory considerations - we basically slap on "limit 5000 offset 5000*iteration" to the query to get it to return a smaller chunk. The whole list is fetched at once. Obviously the time taken to fetch the second 5000 is about twice that of the first - dunno whether that can be improved or not (as I say, I'm not an SQL guru by any means ;))
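For reference, a sketch of that loop next to one commonly used alternative, keyset ("seek") pagination, which asks for rows after the last id seen instead of using an offset. This is not what XBMC does; the simplified song table, the ORDER BY idSong and the function names are all assumptions, and whether it is actually faster here would need measuring.

```python
import sqlite3


def make_demo_db(count):
    # Simplified, hypothetical stand-in for the real music database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE song (idSong INTEGER PRIMARY KEY, strTitle TEXT)")
    conn.executemany("INSERT INTO song (strTitle) VALUES (?)",
                     [("Track %05d" % i,) for i in range(count)])
    return conn


def fetch_with_offset(conn, chunk_size=5000):
    # "limit N offset N*iteration", repeated until a short chunk comes back.
    songs, iteration = [], 0
    while True:
        rows = conn.execute(
            "SELECT idSong, strTitle FROM song ORDER BY idSong LIMIT ? OFFSET ?",
            (chunk_size, chunk_size * iteration),
        ).fetchall()
        songs.extend(rows)
        if len(rows) < chunk_size:
            return songs
        iteration += 1


def fetch_with_keyset(conn, chunk_size=5000):
    # Continue after the last id seen, so no rows have to be skipped again.
    songs, last_id = [], 0
    while True:
        rows = conn.execute(
            "SELECT idSong, strTitle FROM song WHERE idSong > ? "
            "ORDER BY idSong LIMIT ?",
            (last_id, chunk_size),
        ).fetchall()
        songs.extend(rows)
        if len(rows) < chunk_size:
            return songs
        last_id = rows[-1][0]
```

Both functions return the same list; the keyset variant only works when the chunks are ordered by a unique column such as the primary key.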
Updating the list as we go is unfortunately fraught with difficulties - the main one being the sorting which the user can change at any time, and is thus done after the fetch. We could update as we go, inserting in the appropriate place, but this could prove problematic without making the whole directory fetching side of things asynchronous, which causes more issues with other stuff. Certainly something to be considered for the future though.
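The "inserting in the appropriate place" part on its own could look something like the sketch below. The class and its methods are made up for illustration, and it deliberately ignores the asynchronous fetching, which is the hard part.

```python
import bisect


class SortedSongList:
    """Keeps songs ordered by the current sort key while chunks arrive."""

    def __init__(self, key):
        self._key = key
        self._keys = []    # sort keys, kept parallel to self._items
        self._items = []

    def add_chunk(self, songs):
        # Binary-search the position for each new song and insert it there.
        for song in songs:
            k = self._key(song)
            position = bisect.bisect_right(self._keys, k)
            self._keys.insert(position, k)
            self._items.insert(position, song)

    def resort(self, key):
        # The user changed the sort order: re-sort what has arrived so far.
        self._key = key
        self._items.sort(key=key)
        self._keys = [key(song) for song in self._items]

    def items(self):
        return list(self._items)
```

Chunks that arrive after a resort() are inserted according to the new key, so the list stays consistent whichever order the user picks mid-fetch.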
Cheers,
Jonathan