add experiment
Component: cr-action:add experiment (v1.0.0)
Added by: gfursin (2019-10-16 07:55:47)
Authors: Grigori Fursin
License: BSD 3-clause (code) and CC BY-SA 4.0 (data)
Source: GitHub
Creation date: 2014-12-06 15:37:32
CID: 081173242a88bc94:b9a24907ddc53c96


How to get and run stable version (under development):
  pip install codereef
  cr download module:experiment --version=1.0.0 --all
  ck add experiment --help

How to get and run development version:
  pip install ck
  ck pull repo:ck-analytics
  ck add experiment --help

How to run from Python:
   import ck.kernel as ck

   # prepare the input dict according to the JSON API below
   r = ck.access({'action': 'add',
                  'module_uoa': 'experiment'})
   if r['return']>0: return r
Info about the CK module with this action: experiment
Workflow framework: CK
Development repository: ck-analytics
Module description: universal experiment entries
API Python code: Link
    Input:  {
              dict                          - format prepared for predictive analytics:
                                                ("dict")               - add to meta of the entry (useful for subview_uoa, for example)

                                                ("meta")               - coarse-grain meta information to distinguish entries (species)
                                                ("tags")               - tags (separated by comma)
                                                ("subtags")            - subtags to write to a point

                                                ("dependencies")       - (resolved) dependencies

                                                ("choices")            - choices (for example, optimizations)

                                                ("features")           - species features in points inside entries (mostly unchanged)
                                                                           (may contain state, such as frequency or cache/bus contentions, etc.)

                                                "characteristics"      - (dict) species characteristics in points inside entries (measured)
                                                "characteristics_list" - (list) adding multiple experiments at the same time
                                                                         Note: at the end, we only keep characteristics_list
                                                                         and append characteristics to this list ...

                                                                         Note that if a string starts with @@, it should be
                                                                         of the format "@@float_value1,float_value2,..."
                                                                         and will be converted into a list of values which
                                                                         will be statistically processed as one dimension in time
                                                                         (needed to deal properly with benchmarks like slambench
                                                                         which report kernel times for all frames)

                                                (pipeline_state)       - final state of the pipeline

                                                (choices_desc)         - choices description
                                                (features_desc)        - features description
                                                (characteristics_desc) - characteristic description

                                                (pipeline)             - (dict) if experiment from pipeline, record it to be able to reproduce/replay
                                                (pipeline_uoa)         -   if experiment comes from CK pipeline (from some repo), record UOA
                                                (pipeline_uid)         -   if experiment comes from CK pipeline (from some repo), record UID
                                                                           (to be able to reproduce experiments, test other choices 
                                                                           and improve pipeline by the community/workgroups)
                                                (dict_to_compare)      - flat dict to calculate improvements


              (experiment_repo_uoa)         - if defined, use it instead of repo_uoa
                                              (useful for remote repositories)
              (remote_repo_uoa)             - if remote access, use this as a remote repo UOA

              (experiment_uoa)              - if entry with aggregated experiments is already known
              (experiment_uid)              - if entry with aggregated experiments is already known

              (force_new_entry)             - if 'yes', do not search for existing entry,
                                              but add a new one!

              (search_point_by_features)    - if 'yes', find subpoint by features
              (features_keys_to_process)    - list of keys for features (and choices) to process/search (can be wildcards)
                                                   by default ['##features#*', '##choices#*', '##choices_order#*']

              (ignore_update)               - if 'yes', do not record update control info (date, user, etc)

              (sort_keys)                   - if 'yes', sort keys in output json

              (skip_flatten)                - if 'yes', skip flattening and analyzing data (including stat analysis) ...

              (skip_stat_analysis)          - if 'yes', just flatten array and add #min

              (process_multi_keys)          - list of keys (starts with) to perform stat analysis on flat array,
                                              by default ['##characteristics#*', '##features#*', '##choices#*'],
                                              if empty, no stat analysis

              (record_all_subpoints)        - if 'yes', record all subpoints (i.e. do not search and reuse existing points by features)

              (max_range_percent_threshold) - (float) if set, record all subpoints where max_range_percent exceeds this threshold
                                                      useful to avoid recording too many similar points (keeping only *unusual* ones) ...

              (record_desc_at_each_point)   - if 'yes', record descriptions for each point and not just an entry.
                                                Useful if descriptions change at each point (say checking all compilers 
                                                for 1 benchmark in one entry - then compiler flags will be changing)

              (record_deps_at_each_point)   - if 'yes', record dependencies for each point and not just an entry.
                                                Useful if dependencies change at each point (say, different programs may require different libs)

              (record_permanent)            - if 'yes', mark as permanent (to avoid being deleted by Pareto filter)

              (skip_record_pipeline)        - if 'yes', do not record pipeline (to avoid saving too much stuff during crowd-tuning)
              (skip_record_desc)            - if 'yes', do not record desc (to avoid saving too much stuff during crowd-tuning)
            }
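To make the input layout above concrete, here is a hypothetical input dictionary for this action. The key names come from the API above; all concrete values, including the entry name, are invented for illustration only:

```python
# Hypothetical input for the 'add' action of the 'experiment' module.
# Key names follow the API above; the concrete values are invented.
input_dict = {
    'action': 'add',
    'module_uoa': 'experiment',
    'experiment_uoa': 'my-autotuning-results',  # invented entry name
    'search_point_by_features': 'yes',
    'dict': {
        'meta': {'program': 'slambench', 'cpu': 'arm-cortex-a57'},
        'tags': 'autotuning,slambench',
        'choices': {'compiler_flags': '-O3 -funroll-loops'},
        'features': {'cpu_frequency_mhz': 1900},
        # a '@@'-prefixed string is converted into a list of floats and
        # statistically processed as one dimension in time:
        'characteristics': {'kernel_time_ms': '@@12.1,11.9,12.4'},
    },
}
```

Such a dict would be passed to ck.access() as shown in the Python snippet earlier.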

    Output: {
              return        - return code =  0, if successful
                                          >  0, if error
              (error)       - error text if return > 0

              update_dict   - dict after updating entry
              dict_flat     - flat dict with stat analysis (if performed)
              stat_analysis - whole output of stat analysis (with warnings)

              flat_features - flat dict of real features of the recorded point (can be later used to search the same points)

              recorded_uid  - UID of a recorded experiment
              point         - recorded point
              sub_point     - recorded subpoint

              elapsed_time  - elapsed time (useful for debugging - to speed up processing of "big data" ;) )
            }
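The '@@' convention for vector characteristics (described in the input section above) can be sketched as a small helper. This is an illustrative reimplementation of the convention, not the actual CK code:

```python
def expand_vector_value(value):
    """If value is a string of the form '@@v1,v2,...', return the list
    of floats [v1, v2, ...]; otherwise return value unchanged.

    Illustrative sketch of the '@@' convention described above,
    not the actual CK implementation.
    """
    if isinstance(value, str) and value.startswith('@@'):
        return [float(v) for v in value[2:].split(',')]
    return value
```

For example, expand_vector_value('@@12.1,11.9,12.4') yields [12.1, 11.9, 12.4], which can then be statistically processed as one dimension in time (e.g. per-frame kernel times reported by slambench).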
