MCStream: very slow reading performance towards end of stream
I'm using the MCStream library (shipped with MC_Rack) in a C++ application to read .mcd files. The problem I'm facing is that reading becomes *very* slow towards the end of the stream (using the CMCSAStream->GetRawDataOfChannel function).
By slow I mean an average read speed of about 5-6 MiB/s (~700k samples/s) when reading a single channel of a 30 min recording (@25 kHz). If I instead read the same amount of data repeatedly (1 MB chunks, over and over again) from the same file but only at the beginning of the stream (using CMCSAStream->GetRawDataOfChannelStartCount with an offset of zero), reading is about 50 times (!) faster. Using the Neuroshare library gives the same results (which I guess is expected, since I think it is built on top of MCStream).
Is there a way to overcome this and improve reading speeds?
*EDIT*: the above figures are for an external USB3 hard drive. Copying the same file to an internal SSD and reading one channel gives somewhat better results (33 MiB/s for the full stream vs 250 MiB/s for the beginning of the stream only), but that is still a difference of roughly a factor of 8.
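A minimal sketch of how such a comparison could be timed independently of MCStream. It reads fixed-size chunks from a plain file with std::ifstream, either sequentially or always from offset zero, and reports bytes and elapsed time. The names `ReadStats`, `timed_read` and `mib_per_s` are hypothetical and not part of any MCS library. Repeated reads of the same region are likely to be served from the operating system's file cache, so the two modes are not expected to behave alike.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical result type: bytes actually read and wall-clock time spent.
struct ReadStats {
    std::size_t bytes;
    double seconds;
};

// Reads up to `n_chunks` chunks of `chunk_bytes` each from `path`.
// If `rewind_each_time` is true, every read starts again at offset zero;
// otherwise the file is read sequentially until the chunks or the file end.
ReadStats timed_read(const std::string& path, std::size_t chunk_bytes,
                     std::size_t n_chunks, bool rewind_each_time) {
    assert(chunk_bytes > 0);
    ReadStats stats{0, 0.0};
    std::ifstream in(path, std::ios::binary);
    if (!in) return stats;  // file could not be opened

    std::vector<char> buf(chunk_bytes);
    const auto t0 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n_chunks; ++i) {
        if (rewind_each_time) {
            in.clear();   // reset a possible EOF flag from a short read
            in.seekg(0);
        }
        in.read(buf.data(), static_cast<std::streamsize>(chunk_bytes));
        const std::streamsize got = in.gcount();
        if (got <= 0) break;  // nothing left to read
        stats.bytes += static_cast<std::size_t>(got);
    }
    const auto t1 = std::chrono::steady_clock::now();
    stats.seconds = std::chrono::duration<double>(t1 - t0).count();
    return stats;
}

// Converts the measurement to MiB/s (returns 0 if no time elapsed).
double mib_per_s(const ReadStats& s) {
    if (s.seconds <= 0.0) return 0.0;
    return (static_cast<double>(s.bytes) / (1024.0 * 1024.0)) / s.seconds;
}
```

Calling `timed_read(path, 1 << 20, n, false)` and `timed_read(path, 1 << 20, n, true)` on the same file and comparing `mib_per_s` gives a raw-disk baseline to set against the library figures.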
Thanks,
Tiago
tiagogehring- Posts : 2
Join date : 2014-11-28
Re: MCStream: very slow reading performance towards end of stream
Hello Tiago
Reading data with the MCStream library is indeed not very fast. The speed-up you get by reading only the beginning might be due to the automatic disk caching of Windows.
If you are flexible in how you read the data, I would recommend converting the data to HDF5 format (via our DataManager tool, see our website) and then using the library from the HDF5 group (http://www.hdfgroup.org/HDF5/). Data access via HDF5 is *way* faster than with our library. The interface for reading the data is also easier to use than with either MCStream or Neuroshare.
We do have wrappers for Matlab and Python, which make reading those data even easier. Unfortunately, we don't have a C++ wrapper yet.
best regards
Hans
Hans MCS- Posts : 16
Join date : 2008-08-19
Re: MCStream: very slow reading performance towards end of stream
Thanks Hans!
I was looking again at the MCStream library and saw the GetRawData() function, which extracts data from all channels at once. I did some testing and this function is very fast. On my SSD I can now extract data at around 580 MiB/s (or, more precisely, extract 60x4.5e7 samples from a 30 min recording in ~9.3 seconds!). The problem is that I either need a lot of memory to hold that amount of data at once (almost 6 GB in this case) or have to do the processing block-wise. The latter is something I wanted to get to anyway, so I think I will go this route instead of pre-converting the files to HDF5 first (which would involve another processing step).
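The block-wise route could look roughly like the sketch below. It is not MCStream code: `BlockReader` is a hypothetical callback standing in for whatever call delivers all channels for a range of samples, and the signature of GetRawData() is deliberately not assumed. Two further assumptions are made: samples are 16-bit integers, and the channels are interleaved per sample frame.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical reader: fills `out` with `count` frames starting at frame
// `start`, interleaved as [ch0, ch1, ..., chN-1] per frame. In a real program
// this would wrap the library call; its actual signature is not assumed here.
using BlockReader = std::function<void(std::uint64_t start, std::size_t count,
                                       std::vector<std::int16_t>& out)>;

// Bytes needed to hold `frames` frames of `channels` samples each,
// ASSUMING 16-bit samples.
constexpr std::uint64_t buffer_bytes(std::uint64_t frames, std::uint64_t channels) {
    return frames * channels * sizeof(std::int16_t);
}

// Walks the recording in blocks of at most `block_frames` frames and keeps a
// running per-channel sum, so only one block is held in memory at a time.
// Returns the mean value of each channel.
std::vector<double> channel_means(const BlockReader& read, std::uint64_t total_frames,
                                  std::size_t channels, std::size_t block_frames) {
    assert(channels > 0 && block_frames > 0);
    std::vector<double> sums(channels, 0.0);
    std::vector<std::int16_t> block;
    for (std::uint64_t start = 0; start < total_frames; start += block_frames) {
        // The last block may be shorter than block_frames.
        const std::size_t count = static_cast<std::size_t>(
            std::min<std::uint64_t>(block_frames, total_frames - start));
        block.resize(count * channels);
        read(start, count, block);
        for (std::size_t f = 0; f < count; ++f)
            for (std::size_t c = 0; c < channels; ++c)
                sums[c] += block[f * channels + c];
    }
    if (total_frames > 0)
        for (double& s : sums) s /= static_cast<double>(total_frames);
    return sums;
}
```

Under the 16-bit assumption, `buffer_bytes(45000000, 60)` is 5.4e9 bytes for the whole recording, whereas a one-second block at 25 kHz (`buffer_bytes(25000, 60)`) needs only 3 MB.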
In any case thanks again for your help; I will keep the HDF route as an alternative in mind.
Tiago
PS: on a side note, why does MCStream depend on an old Boost library (1.44) on Windows but not on other platforms? Since I use Boost myself, albeit versions from 1.55 onwards (because of some new libraries and better compiler and C++11 support), this causes major headaches (I'm pretty sure that I will run into conflicts if I link with two different Boost versions). I'm thinking of packing MCStream plus the static Boost libraries into a DLL and hoping that __declspec(dllimport)/__declspec(dllexport) (non-standard, Visual C++ only) work, but this, even assuming that it works, is of course not ideal.
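A sketch of what such a wrapper DLL's interface might look like. The idea is to expose only plain C functions and an opaque handle, so that no Boost or C++ library types cross the DLL boundary. All names (`mcswrap_*`, `MCSWRAP_API`, `MCSWRAP_BUILD`) are hypothetical, and the implementation is a stub holding dummy samples; a real wrapper would own the MCStream objects inside `mcswrap_file`. Whether this fully avoids the Boost conflict in a given build is an assumption that would need testing.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// This single file plays the role of the DLL source, so it exports.
#define MCSWRAP_BUILD 1

#if defined(_WIN32)
#  if defined(MCSWRAP_BUILD)
#    define MCSWRAP_API __declspec(dllexport)   // building the wrapper DLL
#  else
#    define MCSWRAP_API __declspec(dllimport)   // using the wrapper DLL
#  endif
#else
#  define MCSWRAP_API
#endif

// ---- Public interface: C linkage, opaque handle, plain integer types ----
extern "C" {
struct mcswrap_file;  // callers never see the definition

MCSWRAP_API mcswrap_file* mcswrap_open(const char* path);
MCSWRAP_API std::size_t mcswrap_read(mcswrap_file* f, std::uint64_t start,
                                     std::int16_t* dst, std::size_t count);
MCSWRAP_API void mcswrap_close(mcswrap_file* f);
}

// ---- DLL-side implementation (stub) ----
struct mcswrap_file {
    std::vector<std::int16_t> samples;  // stand-in for the real stream objects
};

extern "C" mcswrap_file* mcswrap_open(const char* path) {
    if (path == nullptr) return nullptr;
    try {
        mcswrap_file* f = new mcswrap_file;
        f->samples.resize(8);
        for (std::size_t i = 0; i < f->samples.size(); ++i)
            f->samples[i] = static_cast<std::int16_t>(i);  // dummy data
        return f;
    } catch (...) {
        return nullptr;  // never let an exception cross the C boundary
    }
}

// Copies up to `count` samples starting at `start`; returns how many were copied.
extern "C" std::size_t mcswrap_read(mcswrap_file* f, std::uint64_t start,
                                    std::int16_t* dst, std::size_t count) {
    if (!f || !dst || start >= f->samples.size()) return 0;
    const std::size_t first = static_cast<std::size_t>(start);
    const std::size_t n = std::min(count, f->samples.size() - first);
    std::copy(f->samples.data() + first, f->samples.data() + first + n, dst);
    return n;
}

extern "C" void mcswrap_close(mcswrap_file* f) {
    delete f;  // deleting a null pointer is a no-op
}
```

The application then includes only the declarations, links against the wrapper's import library, and is free to use its own Boost version, since the old Boost is linked statically into the DLL and not exported.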
tiagogehring- Posts : 2
Join date : 2014-11-28
Re: MCStream: very slow reading performance towards end of stream
Hello Tiago
I will have a look at the Boost issue; maybe we can move to a newer version in the next update (I don't think we can remove it altogether without major changes to the library code, which I want to avoid).
Hans
Hans MCS- Posts : 16
Join date : 2008-08-19