Scheduling: queue vs. cron job
- Current State: All scheduling is done by cron jobs, including the entire post-processing, e.g. uncompressing the TGZ output, registering results, and finishing tasks/requests.
- Potential Problem: The cron jobs start at fixed times, e.g. every 5 minutes, check the status and react accordingly. If the post-processing takes longer than the interval between two cron job starts, two or more cron jobs may run concurrently and process the same tasks.
- Possible Solutions:
1. Guard the program started by the cron job with a lock file, e.g. check_tasks.lock. A new run exits immediately if the lock file exists, and the running job deletes the lock file when it finishes.
2. Do not perform any "real tasks", e.g. post-processing, from within the cron job, but rather submit a new job to a special administration queue. Introduce new task states for post-processing so that the same job is not submitted to the queue again by later cron job runs.
3. Implement a daemon that controls all activities instead of running several cron jobs potentially in parallel. Possible candidate: Schedule::Cron
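
Options 1 and 2 could be sketched roughly as follows in Python. This is a minimal illustration under stated assumptions, not the project's actual implementation: the helper names cron_lock and submit_postprocessing, the state names, and the in-memory dict and list standing in for the task table and the administration queue are all hypothetical. Only the lock file name check_tasks.lock comes from the notes above.

```python
import os
from contextlib import contextmanager


@contextmanager
def cron_lock(path):
    """Yield True if this run created the lock file, False if it already exists.

    Hypothetical helper for option 1.
    """
    try:
        # O_CREAT | O_EXCL makes creation atomic: os.open fails if the file exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        yield False
        return
    try:
        # Record the owner's PID so a stale lock can be diagnosed by hand.
        with os.fdopen(fd, "w") as lock_file:
            lock_file.write(str(os.getpid()))
        yield True
    finally:
        # Delete the lock at the end of the run, as described in option 1.
        os.remove(path)


# Hypothetical task states for option 2.
EXECUTED = "executed"
POSTPROCESSING_QUEUED = "postprocessing_queued"


def submit_postprocessing(tasks, admin_queue):
    """Queue every EXECUTED task once and mark it so later runs skip it.

    `tasks` maps task id to state; `admin_queue` is a list standing in for
    the administration queue. Both are assumptions for illustration.
    """
    submitted = []
    for task_id, state in tasks.items():
        if state == EXECUTED:
            # Change the state first, so a repeated call cannot resubmit the task.
            tasks[task_id] = POSTPROCESSING_QUEUED
            admin_queue.append(task_id)
            submitted.append(task_id)
    return submitted
```

A cron-started script would wrap its work in `with cron_lock("check_tasks.lock") as acquired:` and return immediately when `acquired` is False. If a run is killed before the lock file is removed, the file stays behind and has to be cleared manually or checked against the stored PID. In a real system, the state change in submit_postprocessing would also need to be atomic in the task database.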