MIL_ID SplitContextClassId, //in
MIL_ID SrcDatasetContextClassId, //in
MIL_ID DstFirstDatasetContextClassId, //in
MIL_ID DstSecondDatasetContextClassId, //in
MIL_DOUBLE Percentage, //in
MIL_ID SplitResultClassId, //in
MIL_INT64 ControlFlag //in
This function splits the source dataset context into the first and second destination dataset contexts, according to a percentage. This is typically done to split all of your data (source dataset context) into the training dataset context (first destination) and the development dataset context (second destination) for MclassTrain(). All datasets (source and destinations) must be for the same type of classifier; that is, you must have allocated all of them with either M_DATASET_IMAGES or M_DATASET_FEATURES.
You can consider the split a type of copy of dataset entries from the source to the first or second destination (all source dataset entries are copied). This function does not alter the source dataset context. Any entries currently in the destination dataset contexts are overwritten. Each source entry is copied to only one destination.
MclassSplitDataset() ensures that the proper proportion of entries for each class in the source dataset context is in both the first and second destination dataset contexts.
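The per-class behavior described above is commonly called a stratified split. The following is a minimal, self-contained C++ sketch of that idea only; it does not call MIL and is not MIL's implementation. The names Entry, SplitResult, and stratifiedSplit are hypothetical, and rounding each class count to the nearest integer is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <random>
#include <vector>

// Hypothetical stand-in for a dataset entry; only the class label matters here.
struct Entry {
    int id;
    int classIndex;
};

// Hypothetical container for the two destinations.
struct SplitResult {
    std::vector<Entry> first;
    std::vector<Entry> second;
};

// Splits the source class by class, so that each class contributes roughly
// `percentage` percent of its entries to `first` and the rest to `second`.
// Assumption: per-class counts are rounded to the nearest integer.
SplitResult stratifiedSplit(const std::vector<Entry>& source,
                            double percentage,
                            std::mt19937& rng) {
    assert(percentage >= 0.0 && percentage <= 100.0);

    // Group the entries by class.
    std::map<int, std::vector<Entry>> byClass;
    for (const Entry& e : source) {
        byClass[e.classIndex].push_back(e);
    }

    SplitResult result;
    for (auto& kv : byClass) {
        std::vector<Entry>& entries = kv.second;
        // Randomize the order within the class, then cut at the requested share.
        std::shuffle(entries.begin(), entries.end(), rng);
        const std::size_t firstCount = static_cast<std::size_t>(
            std::llround(static_cast<double>(entries.size()) * percentage / 100.0));
        for (std::size_t i = 0; i < entries.size(); ++i) {
            (i < firstCount ? result.first : result.second).push_back(entries[i]);
        }
    }
    return result;
}
```

With 10 entries of one class and 20 of another, a percentage of 80.0 places 8 and 16 entries, respectively, in the first destination, so both destinations keep the source's 1:2 class ratio.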
To establish the first and second destination dataset contexts, source entries are randomly selected (split). By default, a different random selection of entries is chosen each time you call this function. To select the same random entries, use M_SPLIT_CONTEXT_FIXED_SEED.
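To illustrate the difference between a different random selection on each call and a repeatable one, here is a small C++ sketch using the standard library's random number facilities. It is a simulation of the concept only and does not reflect how MIL seeds its selection; the function name selectFirst and the seed value 12345 are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Returns the indices of the entries chosen for the first destination.
// useFixedSeed == true  : the same indices are returned on every call
//                         (the idea behind M_SPLIT_CONTEXT_FIXED_SEED).
// useFixedSeed == false : the generator is seeded from std::random_device,
//                         so the selection typically changes between calls.
// Assumption: the seed value 12345 is arbitrary and chosen for this sketch.
std::vector<std::size_t> selectFirst(std::size_t entryCount,
                                     std::size_t firstCount,
                                     bool useFixedSeed) {
    std::mt19937 rng(useFixedSeed ? 12345u : std::random_device{}());

    // Start with indices 0..entryCount-1, then shuffle them.
    std::vector<std::size_t> indices(entryCount);
    std::iota(indices.begin(), indices.end(), std::size_t{0});
    std::shuffle(indices.begin(), indices.end(), rng);

    // Keep only the leading portion as the first destination.
    indices.resize(std::min(firstCount, entryCount));
    return indices;
}
```

Two calls such as selectFirst(100, 80, true) return identical index lists, which mirrors the requirement that a repeated split use the same inputs (same entries and same percentage) to reproduce the same result.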
If the source dataset context contains augmented entries, they can end up in both the first and second destination dataset contexts. In this case, you must not use one of these destinations as your development dataset context or testing dataset context, since only the training dataset context can contain augmented entries. To inquire whether the source dataset context has augmented entries, use M_NUMBER_OF_AUGMENTED_ENTRIES.
An augmented entry is never split from the source entry with which it was augmented. This can cause the percentage of entries in the first and second destination dataset contexts to differ from the specified percentage.
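The effect on the percentage can be seen with a small simulation. The C++ sketch below keeps each source entry together with its augmented entries as one indivisible group and splits by group; it is an illustration under assumed rules, not MIL's algorithm. EntryGroup, GroupSplit, and splitByGroup are hypothetical names, and applying the percentage to the number of groups is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical group: one original entry plus the entries augmented from it.
struct EntryGroup {
    int sourceId;
    std::size_t augmentedCount;
};

// Hypothetical result of a group-wise split.
struct GroupSplit {
    std::vector<EntryGroup> first;
    std::vector<EntryGroup> second;
    double actualPercentage;  // share of all entries that ended up in `first`
};

// Assigns whole groups to one destination or the other, so an augmented entry
// never lands in a different destination than its source entry.
// Assumption: the requested percentage is applied to the number of groups.
GroupSplit splitByGroup(std::vector<EntryGroup> groups,
                        double percentage,
                        std::mt19937& rng) {
    assert(percentage >= 0.0 && percentage <= 100.0);
    std::shuffle(groups.begin(), groups.end(), rng);
    const std::size_t firstGroups = static_cast<std::size_t>(
        std::llround(static_cast<double>(groups.size()) * percentage / 100.0));

    GroupSplit out;
    out.actualPercentage = 0.0;
    std::size_t total = 0;
    std::size_t inFirst = 0;
    for (std::size_t i = 0; i < groups.size(); ++i) {
        const std::size_t groupSize = 1 + groups[i].augmentedCount;
        total += groupSize;
        if (i < firstGroups) {
            out.first.push_back(groups[i]);
            inFirst += groupSize;
        } else {
            out.second.push_back(groups[i]);
        }
    }
    if (total > 0) {
        out.actualPercentage =
            100.0 * static_cast<double>(inFirst) / static_cast<double>(total);
    }
    return out;
}
```

For example, with three source entries that have no augmentations and one source entry that has six, a requested 50.0 percent puts two groups in each destination, yet the first destination holds either 20 or 80 percent of the 10 entries, depending on where the large group lands.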
Specifies the predefined split classification context. This predefined context establishes whether to use a different random selection of entries to split the source dataset context (default), or to use the random selection of entries associated with a fixed seed.
Set this parameter to one of the following values:
For specifying the predefined split classification context
Value | Description |
M_SPLIT_CONTEXT_DEFAULT | Specifies to use a different random selection of entries from the source dataset context to establish the first and second destination dataset contexts. |
M_SPLIT_CONTEXT_FIXED_SEED | Specifies to use the random selection of entries that is associated with a fixed seed. In this case, the first and second destination dataset contexts are established using the same random selection of entries from the source dataset context. This allows you to repeat the same split operation, provided that you make an identical call to MclassSplitDataset() (that is, you specify the same dataset contexts and split percentage). |
Specifies the percentage of entries in the source dataset context that should be in the first destination dataset context. The remaining entries in the source dataset context will be in the second destination dataset context. The percentage value can range from 0.0 to 100.0, inclusive.
Header | Include mil.h. |
Library | Use mil.lib; milclass.lib. |
DLL | Requires mil.dll; milclass.dll. |