New in the Community? Get started here

Schneider Electric Exchange Community

Discuss and solve problems in energy management and automation. Join conversations and share insights on products and solutions. Co-innovate and collaborate with a global network of peers.

Register Now
Resource Advisor - Performance Analytics Forum
cancel
Showing results for 
Search instead for 
Did you mean: 
Lieutenant

Using historical data expressions to do CUSUM alerts

This recipe is part of the Calculated Measurments Cookbook

 

Hi All,

 

I am trying to use the new SUM PER XXX Last Year expression.

 

First question is: The expression seems to ignore part years, that is, if there is no data for a full 12 months, it doesnt output anything (And it would seem the same if its 0's?).  Is there a way around this?

 

For the example below, I have a 'target series of 0's for 12 months, and then a constant monthly value for 12 months.  However, the SUM PER MONTH Last Year expression only output values in the last 2 months.

What I was hoping for was an accruing value over the year in question.

 

Taking a step back, perhaps if any of you have time, I'd appreciate any thoughts on whether there would be a better way to achieve our required outcome, or where I am perhaps getting the use of the 'Last year' expression wrong.

 

Required Outcome

The client has an annual contracted quantity (ACQ) of gas.  If they exceed it, they get extra charges.  We can chart their progress using an index in business analytics in RA easily.  But what we hoped to do, was take it a step further and set an alert when their consumption is starting to get ahead of the ACQ throughout the year.

 

The Plan

(Italicised expressions havent been fully tested, just my first thoughts. I can use them to get a monthly %, but not CUSUM yet)

 

For each year in question, create a monthly index (CY?? ACQ Target), which can be prepopulated for the coming 12 months (and can be bulk uploaded for all client sites).

[[IntervalCoverage=>Exact]][[DisaggregationMethod=>LinearInterpolate]]{RAMetric( CY17 ACQ, GJ ) }

 

Calculate the CUSUM Target in PAM.

SUM PER DAY Last year ( [Gas:CY17 ACQ Target][GJ] )

 

Use the ACQ target as a mask to isolate the appropriate CY of energy data (Using an 'if' statement, knowing values for the target would only be available for the required CY) and do a CUSUM of the energy data.

SUM PER DAY Last year (IF( [Gas:CY17 ACQ Target][GJ] > 0 , RAMetric( Site Gas), 0))

 

Calculate the % difference over the CUSUM values ofr CY ACQ target & CY Energy.

(({RAMetric( Site Gas) } - {RAMetric( CY17 ACQ, GJ ) }) *100 / {RAMetric( CY17 ACQ, GJ ) })

 

Allow disaggregation on absolutely everything!

 

Then create a daily alert on the disaggregated CUSUM %.

 

Thoughts

 

Am I pushing it too far? I would think this should be possible, but I havent seemed to be able to get the 'Last year' expression working as I had intended it.

 

Are there any other quirks around its use?

 

With so many stacked calculations, any tips on where best to use the disaggregation? Does disaggregation even work for alerts?

 

Thanks!

9 REPLIES 9
Commander

Re: Using historical data expressions to do CUSUM alerts

I'm going to go right ahead and ask Anthony Gray​.

Tony, your thoughts?

Kim KREUTZ​, what client is this, so we can log in and see live?

Michael Schmitz
Lieutenant

Re: Using historical data expressions to do CUSUM alerts

Hi Mike, its Valmont Industries.  I've been mucking around with the Hexham site specifically.  Thanks!

Commander

Re: Using historical data expressions to do CUSUM alerts

Hi Kim,

On my first reading of what you want to do with that target, I'm not sure that we even need to use the time-shift operation.  I think you might be looking for the CUSUM operation:

CUSUM PER YEAR( SUM PER MONTH([Gas:CY17 ACQ Target][GJ] ))

and

CUSUM PER YEAR( SUM PER MONTH({RAMetric( Site Gas) }))

will give you ramps that reset every year for the target and the actual consumption.  You can take the ratio of those two and alarm on that maybe?

Let me know if this gets you closer,

Tony.

PS:

Time-shift expressions like "SUM PER MONTH last year (expression)" reach back and get the data for the same month in the previous year, then use it in the expression for the month on the X axis.  So in the chart you showed where it had no values in the middle of 2017, that's because it's showing you the values from 2016 shifted forward and there wasn't anything for those months in 2016.

The time-shifter lets you mix data from the past with data from the present in a single expression so you can compute things like the ratio of "this month" to "same month last year".  An expression like

SUM PER MONTH([my location])/SUM PER MONTH LAST YEAR([my location])

in a chart that covers 2017 will give you a monthly series of ratios (Jan 2017/Jan 2016, Feb 2017/ Feb2016, Mar 2017/ Mar 2016 and so on), plotted on the time axis of the months in 2017.

Lieutenant

Re: Using historical data expressions to do CUSUM alerts

Hi Anthony Gray​, that was excellent thanks! Very much clarifies things out. I had my head around these expressions the complete wrong way!

Just a follow up question:

If I subtract the two CUSUM PER YEARS and disaggregated them, could I set a daily alert based on it exceeding a certain threshold (even though data only comes in monthly)?

Actually, just trying to divide the two streams, is seems to only be doing is based on the summated end year, as opposed to the % difference at that given month through the cusum. Any ideas how to get that working right? (As well as the alert?)

Thanks again!

Commander

Re: Using historical data expressions to do CUSUM alerts

Hi Kim, I think I got it.

The trick was something that even I didn't realize about how CUSUM works.  Though the purpose of the time period in the CUSUM (YEAR in this case) is just to establish the reset frequency of the accumulation, that period was also being interpreted as the output period of the data.  That's wrong - the output period of CUSUM is just whatever the period of the argument to the CUSUM is - in this case it's monthly but if that had been hourly or interval data, the output would have been hourly or interval.  CUSUM doesn't actually touch the output interval at all, unlike the other 'normal' aggregations.

But the rest of the expression (the division) was looking at the  output of those CUSUMs and mistakenly interpreting them as annual values, dividing them to get an overall yearly value, then rolling the result back down to monthly by just repeating it.

I changed your formula in two ways:

  1. I put a directive in to change the yearly period of the output of the CUSUM back to monthly since it shouldn't have been changed to yearly in the first place (I'll probably change CUSUM in the future to not do this)
  2. I changed the formula from Energy/Target to (Energy - Target) / Target * 100, so it comes out as a pure under/over percent, but based on the year to date values.  I'm just guessing at what you wanted, please change back as needed.

You can check that it's working as intended in the above screenshot because the yellow line for the cumulative deviation from target crosses zero everywhere that the dark blue line for cumulative Energy crosses the light blue line for the cumulative target.

You should be able to alarm on this new value and alert the customer when their accumulated YTD energy is over their pro-rated target.

Some things to remember if you want to do further tweaking:

- these CUSUM calculations always start accumulating at the start of whatever the current chart is, despite the somewhat misleading names.  Just because this one says 'CUSUM PER YEAR' doesn't mean that when viewing a chart from June - December that the CUSUM will represent a sum from January before the start of the chart.

- If you want to do charts that straddle years and you don't want it to reset every January (in this case you probably do want it to reset every January because it's an annual target) you would replace CUSUM PER YEAR to CUSUM ALL, which just starts from zero at the beginning of a chart and never resets.

Lieutenant

Re: Using historical data expressions to do CUSUM alerts

Amazing stuff, thanks Tony!

Commander

Re: Using historical data expressions to do CUSUM alerts

I made the mentioned repair to CUSUM so that the output period correctly reflects the input period and the workaround I did in this one case won't be needed in the future.  It's in the next release.

Highlighted
Lieutenant

Re: Using historical data expressions to do CUSUM alerts

Thanks again Anthony Gray​, out of interest, what exactly does the 'jitterto' expression do? How is it different to disaggregation and/or just reusing SUM per X (but for a shorter period than the previous statement, which I've seen work before)?

Commander

Re: Using historical data expressions to do CUSUM alerts

Hey Kim, good question.  The JitterTo property (as well as the companion policies DefaultJitter and JitterMode) get involved when timestamped data first enters an expression from a database, and affects how timestamps are treated.  It's designed to fix the small imperfections in timestamps (usually a few seconds of delay) that result in logged timestamps not quite being sharply on the minute (or 15 minute boundary, or hourly or whatever).  So if a timestamp came in at 13:00:47 and you had hourly jitter correction enabled, it would turn that timestamp into 13:00:00.

Jitter correction is actually also done in EEM as data arrives in the database, so most of the time you don't need to also do it in expressions.  It can be handy to do it in an expression though if it got missed in EEM or if you want options that EEM doesn't handle well.

It differs from the time aggregations in a couple of ways:

1) Jitter is only applied when data enters an expression from 'outside' meaning a database or other data source

2) Jitter doesn't do any summing - if you're jittering to the hour and you have two readings in that hour, the first one will win and the second one will be ignored

3) Jitter will snap a timestamp to the nearest boundary, even if it's a bit _early_, so 12:59:07 will snap to 13:00:00 with hourly jitter correction.

4) Time aggregation operations don't modify timestamps at all - they take them at face value.  So if you say SUM PER HOUR and your data value starts at 1 minute after the hour, it will try to disaggregate a minute's worth of the previous reading to fill in the little gap before the current reading starts.  Similarly, it will try to pro-rate and chop off the minute at the end if it runs into the next hour.  Which in our world usually shows up as 'disaggregation required but not allowed' errors.

I just went to look up a link to this feature in the documentation, to discover that it's not there.  I'll get the documentation updated as quickly as I can.  In the mean time, here's an excerpt:

There are two policies and one stream property that combine to control interval creation from time points:

  1. Policy: DefaultJitter: { None | BelowHourly | HourlyOrLonger | All } choose which streams will have timestamp correction applied.  In PAM, this is set to "HourlyOrLonger", to correct nominal timestamps but leave meter data alone as it is already jitter corrected by EEM at import time.
  2. Policy: JitterMode: { Nearest | Containing | Auto}
    1. Containing: If on an endpoint and the TimestampPosition is 'IntervalEnding', substract a tick and return the containing interval.  This is best for stable misleading 'nominal' timestamps but doesn't work well for communication jitter.
    2. Nearest: find nearest endpoint of interval containing point, then if IntervalSize> 0, subtract/add  a tick depending on TimestampPosition, then use the interval that contains that value.  Classic jitter correct, and available at 1, 5, 10, 15, 20 and 30 and 60 minute.
    3. Auto: use 'containing' if the interval size is >= 60 minutes, otherwise 'nearest'. This is the default option.
  3. Property: JitterTo: { dimensionname | auto | none }
    1. auto: at resolution time the interval size is used to find the first time dimension with the same length and that is used as the jitter dimension.  Whether new timestamps are snapped to the 'nearest' interval or are set to the 'containing' interval depends on the JitterMode policy in effect.  For this to work, the IntervalSize of the input stream must be known.  The size of the interval time dimensions used for comparison is the length of the interval containing 'now'. 
    2. dimensionname: The actual dimension to be snapped to. This can always be set through a stream property directive as needed.  If 'auto' was initially chosen, then after the determination of the matching dimension, its name will be put in this property of the resulting stream, replacing the 'auto' tag.
    3. none: needed to override specific cases where jitter correction is not wanted for a stream when the DefaultJitter policy would otherwise turn it on for a give stream.  Also set when auto determination of the time dimension failed (nonperiodic stream for example, or no data) but other policies indicated that an exception isn't wanted.

Duplicate values

If two values fell in the same interval, they would likely emerge with identical time intervals even if they started out with different timestamps.  The current behaviour of the calculation engine is to discard duplicates, meaning the first one is used and the second one ignored.  If you are expecting intermediate values that you want included, then shorten the time interval size of those generated values and do an explicit aggregation to the desired interval SUM PER HOUR().