Excel VSM Simulator
A still picture of a Value Stream Map may not fully explain the behaviour of the Value Stream. In some cases we need to understand its dynamic behaviour. To do this we will convert our Excel Value Stream Map into a Monte Carlo Simulator.
In Excel Value Stream Map we saw how a complex process could be Value-Stream-Mapped using Excel. We saw the importance of mapping feedback loops such as test-repair loops because they often become the bottleneck for the complete process. In this situation not all items flow along all branches so we need to calculate the % flowing along each branch. This enabled us to calculate the cycle for each workstation based on the staffing. Finally we were able to calculate the ideal staffing to balance the line: insure no workstation cycle is above the average inter-arrival time (takt time). This we did using Solver.
Value Stream Map: Still Picture Vs Process Dynamic Behavior
Mapping the current process with a VSM is a good starting point to understand how the process behaves but this instant picture of the process may not be representative of its behavior in the long run.
To further analyse the process behavior we need variation data on its critical parameters such as:
- Inter arrival time
- Process times
- First Pass Yields
Download this Excel simulator VSMandSimulation1.xlsm from OneDrive folder:
A value stream map is not complete without Work-In-Process (WIP) between process steps data. The problem with WIP data is that it typically varies widely along time so a snap shot of the moment we visit the line may be very misleading. Rather than collect WIP data from the line in our VSM Simulator WIP before each workstation (column Q) is calculated each hour:
New WIP = Previous WIP + Items processed by previous step – Items processed by this step
Items processed (column R) are calculated each hour based on Average time (K), Time Standard deviation (L) and Staffing (M) taking into account the constraint of enough WIP in the previous step.
Gemba data collection to estimate process standard deviations
We need to measure inter arrival times and process times for each process step. To do that we take 10 measurements of each in Gemba and calculate the average and standard deviation:
We assume that process times follow a normal distribution:
Process times: Normal distribution with averages (K) and standard deviations (L)
On the other hand looking at the Inter-arrival times we notice that the standard deviation is greater than its average and the shape of the distribution is telling us that it is closer to an exponential distribution so we will use this assumption in our simulation:
Inter arrival time distribution: Exponential with average K3
We now enter these averages and standard deviations in columns K and L of our simulator.
Lead time and capacity utilization
Lead Time in column T, for each process step, is the average time it takes for an item to be processed in that step including the waiting time due to the WIP in front of it.
Capacity utilization in column U is the average % time the workstation has been operating since the start of simulation based on its theoretical capacity.
WIP (Q) and Capacity Utilization (U) are shown on bar charts on the right:
Total products received and returned
Elapsed hours simulated are in B2. Products received for repair, products repaired and products returned without repair are shown in column A:
Overall process parameters
Total values for columns K to U are shown in row 1:
K1 represents the average Value Add Time (total process time) per product
M1 is the total staffing in the line (Full Time Equivalent)
P1 is the maximum value of all cycle times: it corresponds to the line bottleneck capacity. P1 should be ≤ K3 (Takt)
Q1 is the total WIP in the line.
Products received for repair should still be in the line (WIP) or they should have been returned to the customer repaired or unrepaired:
Received (A3) = WIP (Q1) + Repaired (A13) + Returned without repair (A8)
T1 is the Total Lead Time (hours): The average lead time seen by the customer (not including the customer contribution)
U1 is the overall line capacity utilization.
The Value add to total time rate can be calculated:
Value Add Time / TOTAL Time = K1 / (T1 x 60)
To run the simulator:
Press F9 to simulate 1 hour. Keep pressing to run several hours. The counter in B2 shows the elapsed simulator hours.
To start simulation with an empty line: press Ctrl key + r or press the Reset button. This will reset all line WIPs to zero.
To simulate 100 hours press the RUN 100 button
In the tab Evolución you can see how all Process and WIP values evolve along each hour in the main steps as well as the evolution of total Lead Time:
During the first hours, when the line is empty, there is no waiting time along the process so the total lead time is basically equal to the value add time (one hour). As queues start building up as shown in the WIP chart, waiting time starts to increase in all workstations and that causes the total lead time to increase to a value of 12 hours, where it seems to stabilize. This means that if we take a snap shot of the line in hour 7 basically all lead time is value add. On the other hand if we look at the line in hour 22, when stability seems to be reached, lead time is 12 times value add time: NVA/ VA = 11. Non value add time is made up of all the waiting taking place in front of each workstation.
We can see that, although the average inter-arrival time is 5 min (12 items per hour), 3 arrivals of more than 30 items each have collapsed receiving causing an accumulation of 80 items at reception which has eventually been absorbed by the line.
This is a typical case where most of the variability comes from the customer inter arrival times as seen in the process chart.
Tracking this evolution you can discover unexpected bottlenecks caused by variability that may need to be reinforced with additional staff. We can also estimate the total lead time we can commit to customers. In this case if the process is stable that would be 12 hours minimum.
These are some of the effects we can detect in our simulation:
- Accumulation of WIP in front of a workstation due to insufficient staffing
- Low capacity utilization due to excess staffing
- Accumulation of WIP in front of the repair station due to low first-pass-yield in test (N11)
- Accumulation of WIP in reception (Q4) due to arrivals above plan
- High overall lead time due to excess total WIP in the line
- Test first pass yield below plan which would create a bottleneck both in the test and repair stations
- Highly unbalanced line as shown by large differences in capacity utilization
- Proportion of products under warranty different to plan: it would require line re-balancing
Capacity Utilization Analysis
Repair and Functional test are the two operations which add value and consume most resources in the value stream so we would expect to have high capacity utilization in both. When there is an increase in arrivals we would expect a queue to form before them causing an increase in the overall lead time. We don't want other steps which require low staffing to become a bottleneck for the line so with a small increase in the staffing of those steps which have high capacity utilization we can make sure they never become the bottleneck:
These small increases have increased the overall staffing by 1 person and in this way we have reduced their capacity utilization. You can now see that their cycle has been reduced well below 5 (the takt time). The end result will be a reduction of the average overall lead time for the line.
We can see the line performance along time now:
Test First Pass Yield drop simulation
We want to find out what would be the effect of a drop in test First Pass Yield from the planned figure of 80% to 70%
Looking at the evolution we see a WIP build up first in repair and later on in test:
Looking at the cycle time of both Functional Test and Repair we notice that they have increased from 5 to 5.4, therefore they have become the bottleneck for the total process.
In a situation like this we would analyse the failing items to look for the root cause of their failures to correct it. But on the mean time, if we need to continue the process, we will need additional test and repair capacity (increase test and repair staffing).
You can see Test & Repair Loop simulation in Test/ Repair Loop
- A static Value Stream Map is a good start to understand how the process behaves
- Feedback loops such as test-repair loops are an essential part of the process so they should be included in the VSM
- A snap shot of the WIP situation along the line may not be representative of the normal operation
- If the WIP situation is not typical the NVA calculation will not be correct
- The VSM simulator provides a deeper understanding of the process behavior and it enables what-if studies to optimize the process
- Simulation helps us understand some of the failure mechanisms caused by variability so we can act on the root cause to make real improvements
You can see some additional process variation effects in Value Stream Constraints