I recently embarked on an experiment for a client. Could their website session data predict if they were going to hit their target booked appointment numbers or not? When I shared the results in a recent edition of The Huddle, the responses I got were from marketers of all stripes who were excited to try this method for themselves! It turns out that a lot of people want to do this kind of forecasting but aren’t sure where to start or how to take it past the spreadsheet phase.

In this post, I’ll show you how to go through the whole workflow. It involves getting the data set up properly, running the regression in Google Sheets, testing whether same-week or lagged traffic is actually the better predictor, and then building a live forecast chart in Looker Studio that tells you at a glance whether a client is on track to hit their monthly target. These kinds of predictive metrics aren’t locked behind a fancy business intelligence system, all you need is Google Sheets to get started!

What Data You Need — And How to Get It

My initial spreadsheet had two columns. The first is the leading indicator (in my case, website sessions) and the business outcome you’re trying to predict (appointments, leads, revenue — whatever you’re trying to predict). Both need to be broken down by the same time period.

I’ve found weekly is almost always the right granularity for this kind of analysis. Daily is too noisy because you end up with weekends, holidays, weather events, and other random fluctuations that make it hard to see the underlying relationship. Monthly gives you so few data points that the regression doesn’t have much to work with (12 data points per year isn’t a lot to establish a pattern). Weekly is a nice in-between as it smooths out the day-to-day noise while still giving you enough data to find something real.

For sessions, I pulled weekly data from GA4. I used a Looker Studio table which I then exported to Google Sheets. There are also some third-party add-ons that can automate this into a spreadsheet if you’re doing it repeatedly (for multiple clients, for example).

Before you start, it’s worth taking a moment to audit your GA4 account and make sure your session data is actually reliable. If there are tracking gaps, weird session numbers from misconfigured cross-domain setups, or other oddities, you’ll be building an analysis from questionable data, and the results you get won’t hold up to scrutiny, even if they look great on the surface.

Getting the business outcome data is often the harder part of this whole process. For this particular client, it was appointments from their booking system. For other clients, it might be leads from a CRM, form submissions from their website, or people counter results from their physical store locations. The key is getting it broken down by week and making sure the week definitions match. If the leading indicator (e.g. sessions) weekly definition starts on Sunday and your CRM data starts on Monday, you’ll introduce noise into the relationship before you’ve even done any analysis.

For this client, I had three years of weekly history to work with. That’s a great starting point. In my experience, even six months can give you something useful to work with to start. Just make sure to be honest with yourself about how much confidence you can actually have in a relationship established on 26 data points versus one established on 156 data points!

How to Run a Linear Regression in Google Sheets

This is where I’ll admit I had to ask Claude how to do this again, because I haven’t run a regression since a stats class I took in 1996 as part of my geography degree. The good news is that Google Sheets does all the hard work for you!

Step 1: Build the scatter plot.

Select both columns of data — I had sessions in the first column and the appointments in the second — and insert a chart. Change the chart type to a scatter plot. Sessions went on the x-axis (the input), and appointments went on the y-axis (the thing we’re predicting).

Before you continue on, take a moment to look at your scatter plot. Is there a general upward pattern, or does the data look like pure randomness? If you don’t see something generally trending at this point the analysis probably isn’t going to turn out well. But, if you can see some sort of a rough diagonal pattern, you’re in good shape.

Here is what my original scatterplot looked like, which was pretty encouraging!

Step 2: Add the trendline.

Edit the chart, go to the Customize tab, open the Series section, and check the Trendline box.

Step 3: Display the equation and R².

Still in the Series section, look for the Label dropdown under the trendline options and select “Use Equation”. That’s what displays the formula. The Show R² option is a separate checkbox below it, select that as well.

Note: If you don’t see the R² value appear, make sure your chart is showing the Legend or else it won’t show up.

Once you’ve done this, you’ll see an equation in the format y = mx + b. For the client I was working with, the formula came out as:

Appointments = (Sessions × 0.511) + 1604

Let’s walk through what each piece means.

The slope (0.511) tells you the rate of conversion from sessions to appointments. In this case, roughly every two additional sessions is associated with about one more appointment. That’s the web-driven relationship.

The intercept (1604) is the really critical part. It represents the baseline number of appointments you’d theoretically see even with zero website sessions. This would include all the non-web drivers like repeat customers, word of mouth, referrals, radio ads, and phone calls from people who found the number on a business card. For a service business with a lot of returning customers (as this business is), this number can be quite large, and that’s okay!

The of 0.347 means that website sessions explain about 35% of the variation in booked appointments. That’s a moderate relationship, not a strong one, so should we worry about that?

We know that the other 65% of variation is driven by things outside this model: seasonality, whether a competitor ran a sale, what the weather was like (genuinely relevant for this business), and just all the weird randomness that exists when you run a business. Since we’re asking a directional question (“are we on track?”) rather than an exact question (“how many appointments will we have?”) a moderate R² can absolutely be useful. Part of why it isn’t stronger might be that your analytics data isn’t perfect to begin with, and that’s worth keeping in mind when you’re interpreting results.

Testing Time Offsets: Does This Week's Traffic Predict This Week or Next?

I wanted to be sure that we were on the right track with this analysis, and so I tested three versions of the data: same-week (sessions and appointments from the same seven-day period), one-week offset (this week’s sessions predicting next week’s appointments), and a two-week offset (this week’s sessions predicting appointments two weeks from now).

The same-week relationship turned out to be the strongest for this particular client. That does make sense for this business as many of their appointments aren’t heavily considered in advance and the entire buying cycle happens within days.

But of course for other businesses, that might not be the case. A B2B service with a longer consideration cycle, or a home renovation company where people research for weeks before committing, might show a much stronger relationship with an offset. In that case the website visit is still a leading indicator, it just has a longer lag time.

It’s easy to add in the offset, just shift your outcome column down by however many rows of offset that you want to introduce. It’s worth trying a few options and comparing the R² values against what you know about the typical lead time for this business.

This is also a really important conversation to have with clients! “How long does it typically take from someone finding your site to actually booking?” opens up a useful discussion about their sales cycle that goes beyond analytics.

Building a Live Forecast Dashboard in Looker Studio

Now we have the numbers, but we also wanted to build a way to visualize how things are going. To create this, we’ll create a chart that shows cumulative actual performance through the current date, and the regression-derived target line showing where you’d expect to be at current session rates. When a client looks at this chart, they can see immediately whether they’re pacing ahead, behind, or on track for the month.

A note on data sources before we start: This chart uses blended data. In my case, I was pulling GA4 sessions and the client’s appointment data from BigQuery. If your appointment data (or whatever outcome you’re tracking) lives in a spreadsheet, a Google Sheets connector works the same way. This is one case where you can’t connect to GA4 alone in Looker Studio, because the chart needs to see both datasets together to calculate the target line.

Step 1: Create the blend

Start by blending the two data sources together. I find the easiest way to blend data is to create two tables with the dimensions and metrics that you need, select them both, right-click, and choose Blend.

The join configuration should be Left Outer with your session source (GA4 in my case) as the left data source, and then the appointment source (BigQuery in my case) as the right table. Use the date in both sources as the joined value.

For metrics, I am pulling in sessions from GA4, and appointments booked from BigQuery.

Step 2: Create the target metric

You’ll create the target metric in the blended data source itself. Normally I don’t do this but because of the math you have going on, Looker Studio will not work unless the calculation is in the blend itself and not on the final chart.

You’ll have two metrics in your source data, which in my case was GA4. The first metric is sessions, and then the second metric will be a calculated field that I called Target. To create the target field, click Add metric, then click Add calculated field.

In the calculated field box, you’ll add this calculation that is based on your regression formula. For this client, it’s:

(Sessions / COUNT_DISTINCT(Date) * 28 * 0.511) + 1604

This calculation takes the daily average session rate over the window we’re looking at, projects it across a 28-day period, and converts that to a predicted appointment count using the regression equation. The result is a dynamic target line that updates automatically as traffic data comes in. Make sure to change the 28 number to match however many days you’ll want to look at in your time series chart!

Step 3: Create the time series chart

Add a new time series chart using your blended data source. The dimension (x-axis) will be date, and you’ll have two metrics (y-axis): Sessions and Target.

Open up the Style tab and for the first series (Sessions), select Cumulative and add a Linear trendline.

Step 4: Create the target line

Scroll down further in the Style tab and find the Reference Lines section. Click Add a reference line and select the Line type as Metric, then select your Target metric for the value. The Line Calculation should be Average by default, then style the line however you’d like. I added a label and made the line solid so it’s easy to see.

Step 5: Set up the rolling date window

Create a new time series chart with your blended data source, then go to the Setup tab for your date range control, click the calendar icon, and select Advanced from the date picker dropdown. From there, set the start date as Today → Minus → 7 → Days, and the end date as Today → Plus → 21 → Days.

Using this method gives you future end dates in your chart, which gives you a window that will always show the last week of actuals plus three weeks of forward projection without any manual updating. Of course based on your lead time, you may want to extend out these values. I’m using a four week rolling timeframe for this client.

Step 6: Read the chart

This is an example of what you might get:

When the cumulative trendline hits the target line before you run out of room on the right side of the chart, the client is pacing ahead which means that they’re on track to exceed their target (which is what you see in the screenshot above). When it’s not hitting the trend line before you run out of chart, that’s an early warning sign. Depending on how bad the slope is you can adjust accordingly. For example, if it looks like you’re only off by a day or two it might be just a small adjustment. If it looks like your cumulative line won’t hit the target well beyond the end of the time period you’re evaluating, that’s time to ring the alarm and make some course corrections. This chart gives you the lead time you need to respond to potential missed goals, before it happens.

What This Approach Can't Do (And When to Go Further)

I want to end this off by going over some of the limitations of this method. It’s useful but isn’t a magic solution.

Seasonality can be a huge factor. If your client’s business has strong seasonal patterns, running a single regression across all seasons can produce a misleading picture. The relationship between traffic and bookings might be strong within any given season, but weak-looking across the whole year because the scale of both changes so dramatically. In that case, consider running separate regressions by season or by quarter and see whether the R² values are more meaningful at that granularity.

Correlation isn’t causation. The regression tells us that sessions and appointments move together consistently enough to be useful for pacing but it doesn’t tell us that traffic causes appointments. Keep that R² value in mind! That doesn’t make the relationship less useful for forecasting, but it’s worth mentioning when you’re having conversations about what levers actually drive desired outcomes.

GA4’s built-in predictive metrics aren’t the answer here. GA4 does have predictive metrics, but they’re focused on purchase probability and churn, not service business outcomes like appointments. On top of that, they require at least 1,000 returning users who have triggered the relevant condition over the previous 28 days to even be eligible. For many smaller businesses, that threshold is out of reach. And even if you hit it, predictive metrics are only available in the audience builder and Explorations, which means they’re far less accessible in Looker Studio and frankly I rarely want to send a client into the Explorations part of GA4. So for appointment forecasting, rolling your own regression is likely the more practical path.

Single-variable regression has a ceiling. An R² of 0.347 means 65% of the variation is unexplained. If you want to do better, the next step is multiple regression, where you introduce additional variables like channel, seasonality, or location. That’s when Google Sheets starts to struggle and you might consider Python, R, or a tool like BigQuery ML to do this analysis. Just be careful about slicing too finely. You don’t want to combine so many different factors that you end up with too few data points per segment to be statistically reliable.

Frequently Asked Questions

In academic marketing research, benchmarks run roughly 0.75 for a strong relationship, 0.50 for moderate, and 0.25 for weak. But those thresholds assume you’re building a precision model. In my experience working with service businesses, an R² of 0.30–0.35 is meaningful for human behavior prediction, which is the kind of model where you’re forecasting directional trends rather than exact numbers. I wouldn’t use these values if the R² came back below 0.10, because at that point the relationship is probably just sophisticated noise.

After adding a trendline (Customize tab → Series → Trendline), look for the Label dropdown in the trendline settings and set it to “Use Equation”. That’s what shows the formula. The Show R² checkbox is a separate option below it, make sure to check that as well to display the value.

The intercept is the predicted value when your input (sessions) equals zero. In this example, it represents all the business that comes in through non-website channels: repeat customers, referrals, word of mouth, radio ads, walk-in visits, etc. For a service business with a strong returning customer base, the intercept can be quite large, and that’s expected.

It depends on your business model and I recommend testing both options. For businesses where people search and book immediately (urgent home services, for example), same-week likely shows the strongest relationship. For businesses with a longer consideration cycle, a one- or two-week lag often produces a higher R². The test is straightforward: duplicate your spreadsheet, shift the outcome column down by one row, re-run the scatter plot, and compare R² values. Whichever version produces the strongest relationship is likely the right lag for that client.

Start With What You Have

You don’t need three years of history to try this. If you have sessions and a business outcome broken down by week for even the last six months, that’s enough to run the test and see whether a meaningful relationship exists.

What I find most valuable about this approach isn’t the math (although it is neat when it works!) but rather the shift in the conversation it enables. Moving from “here’s what happened last month” to “here’s whether you’re on track right now” is a fundamentally different kind of value to offer a client. And the tools to do it are ones you’re likely already using.

If the R² comes back low, that’s information too. It means website traffic isn’t a strong predictor for that outcome, and you can have a more grounded conversation about what actually is. And if it comes back moderate or strong, you have the foundation for a live forecast dashboard that earns its place in your marketing dashboards.

Black and white portrait of Dana DiTomaso

Dana enjoys solving problems that haven’t been solved before. With her 20+ years experience in digital marketing and teaching, she has a knack for distilling complex topics into engaging and easy to understand instruction. You’ll find Dana sharing her knowledge at digital marketing conferences around the world, teaching courses, and hosting a technology column.

Learn more about Dana