<div dir="ltr">Hi -<div><br></div><div>I have some code that creates a dataset after a bunch of ornate processing (figuring out which volumes to censor based on subject performance, subject motion, and other scanning params; creating aggregate event types for certain events; etc.).  Up to this point, dataset creation has been done per session: subjects were scanned twice, each session is processed separately, and all that ornate processing differs from session to session.</div><div><br></div><div>Now I want to aggregate these two sessions and do MVPA things after throwing all the data into the hopper.  Specifically, instead of using NFoldPartitioner on a day's worth of runs (there are 4 runs per day) and leaving one run out, I want to run it on the combined runs from both days (8 runs) and leave two out.  </div><div><br></div><div>PyMVPA is awesome, so it would be clear how to do this if all the data were aggregated together; but as I mentioned, getting the dataset into the right format takes a bunch of processing, and it would be a pain to combine all the various files into one single aggregate set.  What I'd rather do is just glue together two separate datasets, which have already been processed in the ways they require, so that the new dataset just has the samples, targets, and associated attributes from the second session's dataset glued onto the first session's dataset.</div><div><br></div><div>The ds.samples variable reports as being a numpy.ndarray, so I figured I could just stuff them together with array operations, for instance:</div><div><br></div><div>combined_ds = np.append(ds_1.samples, ds_2.samples)</div><div><br></div><div>and so on for the targets, sample attributes, etc.  But nope, this gets screwed up immediately:</div><div><br></div><div>


<p class=""><span class="">In [<b>57</b>]: ds_1</span> = m.MVPAMaster("tp101", 1, "dc", "new_temporal_tp101_day1.nii")</p><p class=""><span class="">In [<b>58</b>]: </span>len(ds_1.ds)</p><p class="">Out[<b>58</b>]: <span class="">290</span></p><p class=""><span class="">In [<b>57</b>]: ds_2</span> = m.MVPAMaster("tp101", 2, "dc", "new_temporal_tp101_day2.nii")</p><p class=""><span class="">In [<b>58</b>]: </span>len(ds_2.ds)</p><p class="">Out[<b>58</b>]: <span class="">290</span></p><p class=""><span class="">In [<b>64</b>]: </span>combined = np.append(ds_1.ds.samples, ds_2.ds.samples)</p><p class=""><span class="">In [<b>65</b>]: </span>len(combined)</p><p class="">Out[<b>65</b>]: <span class="">16960360</span></p></div><div><span class=""><br></span></div><div><span class="">I'm thinking this is a mapper unrolling everything behind the scenes, maybe?  I could beat my head against this for a while, but I figured I'd ask first and see if there's a straightforward way to extend a dataset in this fashion?</span></div><div><span class=""><br></span></div><div><span class="">Thanks,</span></div><div><span class=""><br></span></div><div><span class="">Shane</span></div></div>
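<div>For reference, a minimal NumPy-only sketch of what the session above appears to show. Called without an axis argument, np.append flattens both inputs before joining them, and 16960360 = 580 × 29242, which is consistent with two 290-row sample arrays being flattened (assuming both sessions have the same number of features). The array shapes, the names samples_1, samples_2, chunks_1, and chunks_2, and the idea of offsetting the second session's run labels are illustrative assumptions, not taken from the original code.</div>

```python
import numpy as np

# Small stand-ins for the two sessions' sample matrices
# (rows = volumes, columns = features). Shapes are made up for illustration.
samples_1 = np.arange(12).reshape(4, 3)
samples_2 = np.arange(12, 24).reshape(4, 3)

# Without an axis argument, np.append flattens both inputs first,
# so the result is one long 1-D array.
flat = np.append(samples_1, samples_2)

# Joining along axis 0 keeps one row per volume.
stacked = np.concatenate([samples_1, samples_2], axis=0)

# Hypothetical run labels: one volume per run here, 4 runs per session.
chunks_1 = np.array([0, 1, 2, 3])
chunks_2 = np.array([0, 1, 2, 3])

# Offset the second session's labels so all 8 runs stay distinct.
chunks = np.concatenate([chunks_1, chunks_2 + chunks_1.max() + 1])

print(flat.shape)     # (24,)
print(stacked.shape)  # (8, 3)
print(chunks)         # [0 1 2 3 4 5 6 7]
```

<div>This only covers the raw arrays. PyMVPA also provides a dataset-level vstack helper (mvpa2.base.dataset.vstack) that is meant to stack whole datasets along with their sample attributes, which is probably closer to what is wanted here; whether it works unchanged on these particular datasets has not been checked.</div>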