
Scheduling Data Sampling and Profiling

Profiling can consume a lot of time and resources. You can schedule the process to run periodically and limit each run either by maximum duration (e.g. 2h) or by amount of data (e.g. 300 MB). The bridge saves profiling results in the MIMB cache as soon as possible: when it profiles multiple files, it saves the results for each file as soon as they are ready. If the bridge fails, it can restart from the latest point available in the cache. When the bridge detects that a file did not change (e.g. the checksum is the same), it can update the profile time for the file.

The bridge tries to return as many results as possible to the caller when it:

  • Completes
  • Fails
  • Reaches the time or volume limit
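
The behavior described above can be illustrated with a minimal Python sketch. This is not the bridge's actual implementation: the names `run_profiling`, `ProfileRun`, `profile`, and `checksum` are hypothetical, a plain dictionary stands in for the MIMB cache, file contents are passed in memory, and SHA-256 is an assumed choice of checksum. The sketch only shows the general pattern: results are saved per file, unchanged files only get their profile time refreshed, and the caller receives whatever was finished when the run completes, fails, or hits a limit.

```python
import hashlib
import time
from dataclasses import dataclass, field


@dataclass
class ProfileRun:
    """Outcome handed back to the caller, even when the run stops early."""
    status: str                                     # "completed", "failed" or "limit_reached"
    profiled: list = field(default_factory=list)    # files profiled during this run
    refreshed: list = field(default_factory=list)   # unchanged files, profile time updated only


def checksum(data: bytes) -> str:
    # Assumed checksum; the real bridge may detect changes differently.
    return hashlib.sha256(data).hexdigest()


def profile(data: bytes) -> dict:
    # Stand-in for real profiling: two trivial statistics.
    return {"bytes": len(data), "lines": data.count(b"\n")}


def run_profiling(files, cache, max_seconds=None, max_bytes=None):
    """Profile `files` (name -> bytes), saving each result into `cache` immediately."""
    start = time.monotonic()
    used_bytes = 0
    run = ProfileRun(status="completed")
    try:
        for name, data in files.items():
            # Stop when the time limit is reached, keeping results so far.
            if max_seconds is not None and time.monotonic() - start >= max_seconds:
                run.status = "limit_reached"
                break
            digest = checksum(data)
            entry = cache.get(name)
            if entry is not None and entry["checksum"] == digest:
                # File did not change: only update the profile time.
                entry["profiled_at"] = time.time()
                run.refreshed.append(name)
                continue
            # Stop when profiling this file would exceed the volume limit.
            if max_bytes is not None and used_bytes + len(data) > max_bytes:
                run.status = "limit_reached"
                break
            # Save the result for this file as soon as it is ready.
            cache[name] = {
                "checksum": digest,
                "profile": profile(data),
                "profiled_at": time.time(),
            }
            used_bytes += len(data)
            run.profiled.append(name)
    except Exception:
        # On failure, whatever is already in the cache is kept.
        run.status = "failed"
    return run
```

Because each result goes into the cache as soon as it is ready, calling `run_profiling` again with the same cache picks up where the previous run stopped: files already profiled and unchanged are only refreshed, and the remaining files are profiled.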

Steps

  1. See Manage Schedules for guidance on scheduling an operation.
  2. Specify the model to sample as the OBJECT.
  3. Select Data sampling and profiling as the OPERATION.
  4. Specify the remaining profiling and sampling options, such as the time or volume limit.

Example

Sign in as Administrator and go to MANAGE > Schedules. Click create new. Provide “S and P” as the NAME. Specify the model to sample as the OBJECT. Select Data sampling and profiling as the OPERATION. Specify the other profiling and sampling options.
