In this paper, stochastic programming techniques are adapted and further developed for application to discrete event systems. We consider cases in which the sample paths of the system depend discontinuously on the control parameters (e.g., in the modeling of failures or of several competing processes), which can make gradient estimates difficult to compute. Methods that use only samples of the performance criterion are developed, in particular finite differences with reduced variance and concurrent approximation and optimization algorithms. Optimization of the stationary behavior is also considered. Convergence results and the results of numerical experiments are reported.