All three airmass
numbers describe the same quantity, so they are supposed to be identical.
Assuming that the engine is a closed system, we should be able to assume
that the airmass in the intake should result in the
same airmass approximated by the wideband sensor on
the exhaust side. This is a perfect setup for performing calibrations.
We can either calibrate the MAF by juxtaposing
the MAF-resulting CAM against the
IFR*IPW*AFRwb=15*MAF/RPM
IFR*IPW*AFRwb=GMVE*MAP/TEMP
Once you perform all these calculations for your log, you end up with three new
columns, and they all estimate cylinder airmass.
In a perfect world, they would be identical. In this world, they
won't be. It is worth, however, taking a look at how the
resulting values differ from each other when you look at them not one value at
a time, but as a whole dataset.
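As a rough illustration of the "three new columns" idea, here is a minimal Python sketch that applies the two relations above to every row of a log. The field names (`ifr`, `ipw`, `afr_wb`, `maf`, `rpm`, `gmve`, `map`, `temp`), the units, and the two sample rows are all made up for the example; the constant 15 is taken straight from the MAF relation, which works out for an eight-cylinder four-stroke engine with MAF in g/s (60 seconds per minute divided by 4 intake events per revolution).

```python
from statistics import mean

def cam_from_fuel(ifr, ipw, afr_wb):
    """Fuel-side estimate: injector flow rate * pulse width * wideband AFR."""
    return ifr * ipw * afr_wb

def cam_from_maf(maf, rpm):
    """MAF-side estimate, in the 15*MAF/RPM form used in the text."""
    return 15.0 * maf / rpm

def cam_from_sd(gmve, map_kpa, temp_k):
    """Speed-density estimate, in the GMVE*MAP/TEMP form used in the text."""
    return gmve * map_kpa / temp_k

def add_cam_columns(log_rows):
    """Return copies of the log rows with the three CAM estimates added."""
    out = []
    for r in log_rows:
        row = dict(r)
        row["cam_fuel"] = cam_from_fuel(r["ifr"], r["ipw"], r["afr_wb"])
        row["cam_maf"] = cam_from_maf(r["maf"], r["rpm"])
        row["cam_sd"] = cam_from_sd(r["gmve"], r["map"], r["temp"])
        out.append(row)
    return out

def mean_offset(rows, a, b):
    """Average of column a minus column b over the whole dataset."""
    return mean(r[a] - r[b] for r in rows)

# Invented sample rows, only to show the shape of the calculation.
sample_log = [
    {"ifr": 3.0, "ipw": 0.004, "afr_wb": 14.7, "maf": 9.4, "rpm": 800,
     "gmve": 1.2, "map": 44.0, "temp": 300.0},
    {"ifr": 3.0, "ipw": 0.012, "afr_wb": 12.8, "maf": 92.0, "rpm": 3000,
     "gmve": 1.5, "map": 95.0, "temp": 310.0},
]

rows = add_cam_columns(sample_log)
for r in rows:
    print(round(r["cam_fuel"], 4), round(r["cam_maf"], 4), round(r["cam_sd"], 4))
print("mean MAF-vs-fuel offset:", mean_offset(rows, "cam_maf", "cam_fuel"))
```

The `mean_offset` helper is only one of many possible whole-dataset summaries; the point is to compare the columns across the entire log rather than one sample at a time.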
The silly side effect of doing calibrations this way is that if you plot the various
CAM pairings against each other, they should form a straight line. After all,
y=x, so all points in the set should be of the (x,x)
form, for example (0.1,0.1), (0.3,0.3), (0.55,0.55), etc. This makes it
very appealing to look at this problem graphically, as most data should be
right along the y=x line, while the more troublesome data points will be easily
spotted simply by visual inspection of the graph. We'll see more of this
later, but for now just try to expand your thinking
from single-point calculations to calculations for a whole set, and their
graphical representation.
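The same y=x idea can be expressed numerically. The sketch below measures how far each (x, y) pair sits from the identity line and splits the set into points that hug the line and points that do not. The tolerance value and the sample pairs are arbitrary choices for illustration, not numbers from my logs.

```python
import math

def distance_from_identity(x, y):
    """Perpendicular distance from the point (x, y) to the line y = x."""
    return abs(y - x) / math.sqrt(2)

def split_by_identity(points, tol):
    """Separate points within `tol` of y = x from those farther away."""
    on_line, off_line = [], []
    for x, y in points:
        if distance_from_identity(x, y) <= tol:
            on_line.append((x, y))
        else:
            off_line.append((x, y))
    return on_line, off_line

# Invented CAM pairs: three near the line, one far above it.
pairs = [(0.1, 0.1), (0.3, 0.31), (0.55, 0.54), (0.2, 1.6)]
close, suspects = split_by_identity(pairs, tol=0.05)
print("near y=x:", close)
print("worth a second look:", suspects)
```

A fixed tolerance is a crude stand-in for eyeballing the graph, but it picks out the same kind of point: the one that is nowhere near the line.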
So this picture has the CAM from fuel on the horizontal axis, and it has the
two airmass predictors (
Excel isn't particularly useful for exploratory data analysis. Matlab has some new 'data brushing' functionality in it,
but it's still a bit clunky. A while back I found a program that is
well suited for such a task: Tableau. Officially it's a
program for 'business intelligence,' whatever that means. I use it because
it's lightning quick to adapt to changes, which allows me to rip through
hundreds of different scenarios and ways of looking at the same data. Not
only does it do the regular charting, but you can also group, filter, color, summarize,
subtotal, etc. on just about any number of parameters. I've been using it
for a few years now, but I haven't had a really good clear case where I could
show off why it's cool. Searching for the source of outliers in a large
dataset is a good showcase of what it can do for us.
First, I did a graph of SD-sourced CAM against the fuel consumption-sourced
IAT also does not seem to be the cause, as the outliers seem to have the same
values as many other samples that fall directly in line. What about ECT? All
I had to do was change the source of color from IAT to ECT, and we get a new
graph.
ECT also has a variety of values for the outliers, but again, no clear pattern
emerges in which the outliers would react to different ECT than the 'proper'
values. What else could it be? How about different throttle inputs?
In this graph I scaled the coloring in such a way that the off-throttle/very
light throttle would show up as gray. There's a definite uniformity in
that the outliers all occurred at off-throttle/very
light throttle. So let's summarize what we know so far: the outliers
are independent of temperatures, and they occur only at very light
throttle and very low MAP values. Could it be deceleration? If it is,
DFCO (deceleration fuel cut-off) could be activating, wreaking havoc on fueling. So how else would
DFCO show up in our data? If it lives up to its
name, it cuts fuel delivery, causing an abnormally lean condition.
Let's color up our graph based on the AFR from the wideband sensor:
AFR is definitely significantly lean in all the outliers. So I think it
is quite safe to say that the DFCO-induced lean condition is the cause behind
some of the CAM estimations being severely off, creating these
visually prominent outliers.
Another not only cool, but also very useful part of Tableau is the ability to
select groups of points, for either inclusion or exclusion. I selected
all the outliers, and I excluded them from the graph, creating this:
Isn't it a much cleaner graph? Look at the
values on the axis: no more samples with 1.6 g/cyl, which is
achievable only on an FI car with a generous amount of boost. There are
still some values that stick out and are not exactly on the trend line, but
they are the inherent noise in the system.
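I excluded the outliers by selecting them by hand in Tableau. A rule-based alternative, sketched below, is to flag samples that combine very light throttle with a very lean wideband reading, which is the signature found above. The thresholds (`lean_afr=17.0`, `light_throttle_pct=2.0`), the field names, and the sample rows are hypothetical and would need tuning against a real log.

```python
def looks_like_dfco(sample, lean_afr=17.0, light_throttle_pct=2.0):
    """Heuristic: very lean wideband AFR while the throttle is (nearly) closed."""
    return (sample["afr_wb"] >= lean_afr
            and sample["tps_pct"] <= light_throttle_pct)

def drop_dfco(samples, **thresholds):
    """Split samples into (kept, dropped) using the heuristic above."""
    kept, dropped = [], []
    for s in samples:
        if looks_like_dfco(s, **thresholds):
            dropped.append(s)
        else:
            kept.append(s)
    return kept, dropped

# Invented samples: cruise, probable fuel cut, idle, heavy throttle.
samples = [
    {"tps_pct": 18.0, "afr_wb": 14.6, "cam_fuel": 0.42},
    {"tps_pct": 0.0, "afr_wb": 21.5, "cam_fuel": 1.60},
    {"tps_pct": 1.0, "afr_wb": 14.8, "cam_fuel": 0.15},
    {"tps_pct": 55.0, "afr_wb": 12.6, "cam_fuel": 0.71},
]

kept, dropped = drop_dfco(samples)
print(len(kept), "kept;", len(dropped), "dropped as probable fuel cut")
```

Requiring both conditions keeps ordinary idle samples (light throttle, normal AFR) in the dataset, so only the fuel-cut events are thrown away.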
Now that we know how to get rid of outliers, let's compare MAF and SD directly
against each other:
MAF is on the bottom, SD is up top. They both have
noise, but they both also do a very good job of sticking to the trend
line. The MAF seems to be a little noisier, especially on the low end of airmass values, where SD seems to excel. This is
exactly why GM uses the hybrid MAF/SD model: they leverage the
stability of SD for lower-airmass situations, and MAF
for higher airmass. It is quite literally the best of both worlds.
Since everything we've done so far has been graphical, let's
take a look at some numbers, to see if they back up what we could see by
observing patterns.
There are a bunch of statistics that are used to
describe how close two functions are to each other. I'm not going to
explain all of them here, but here's the rundown for those who know what they
mean:
Trend Lines Model
MAF:
SSE (sum squared error): 2.19023
MSE (mean squared error): 0.0002055
R-Squared: 0.998423
Standard error: 0.0143366
p (significance): < 0.0001
SD:
SSE (sum squared error): 1.07925
MSE (mean squared error): 0.0001014
R-Squared: 0.999398
Standard error: 0.01007
p (significance): < 0.0001
SD wins overall: better R^2, lower SSE and MSE, and a
smaller standard error.
That was for the case where we cleaned up the data.
Out of curiosity, let's see how they fare when we look at the data before
the cleanup.
MAF:
SSE (sum squared error): 22.7658
MSE (mean squared error): 0.0020751
R-Squared: 0.983645
Standard error: 0.0455531
p (significance): < 0.0001
SD:
SSE (sum squared error): 29.177
MSE (mean squared error): 0.0026694
R-Squared: 0.983759
Standard error: 0.0516667
p (significance): < 0.0001
In this case, MAF does a little better than SD on SSE, MSE and standard error, while R-Squared is essentially a tie.
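For anyone who wants to reproduce this kind of summary outside Tableau, here is a minimal sketch of an ordinary least-squares trend line with the same four figures. In the numbers listed above, the standard error equals the square root of the MSE (for example, sqrt(0.0002055) is about 0.01434), so the sketch assumes MSE is SSE divided by n - 2 residual degrees of freedom. That divisor is my assumption, not something Tableau's output states, and the sample data is invented.

```python
import math

def trend_line_stats(xs, ys):
    """Fit y = slope*x + intercept by least squares and summarize the fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals around the fitted line.
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    # Total variation of y around its mean.
    sst = sum((y - mean_y) ** 2 for y in ys)
    mse = sse / (n - 2)  # assumed: two fitted parameters
    return {
        "slope": slope,
        "intercept": intercept,
        "sse": sse,
        "mse": mse,
        "r_squared": 1.0 - sse / sst,
        "std_error": math.sqrt(mse),
    }

# Invented data: fuel-based CAM on x, a predictor's CAM on y.
xs = [0.1, 0.2, 0.3, 0.4, 0.5]
ys = [0.11, 0.19, 0.31, 0.39, 0.50]
stats = trend_line_stats(xs, ys)
for name, value in stats.items():
    print(f"{name}: {value:.6f}")
```

Running it once on the MAF pairing and once on the SD pairing gives the same kind of side-by-side comparison as the listings above.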
So to wrap up:
Hopefully this cleared some things up, as I've been getting
a lot of questions lately about my methods of calibration.