|Role|Contributor(s)|
|:---|:-------------|
|VP of Customer Success|Zay Hanlon (@zayhanlon)|
|Customer Success Manager (CSM)|Jason Lewis (@patagonia121)|
|Customer Success Manager (CSM)|Michael Pinto (@pintomi1989)|
|Customer Support Engineer (CSE)|Kathy Satterlee (@ksatter), Grant Bilstad (@Pacamaster), Ben Edwards (@edwardsb)|
|Infrastructure Engineer|Robert Fairburn (@rfairburn)|
- To make a request of this department, create an issue and a team member will get back to you within one business day. (If urgent, mention a team member in the #g-customer-success Slack channel.)
- Any Fleet team member can view the kanban board for this department, including pending tasks and the status of new requests.
- Please use issue comments and GitHub mentions to communicate follow-ups or answer questions related to your request.
The customer success department is directly responsible for ensuring that customers and community members of Fleet achieve their desired outcomes with Fleet products and services.
Occasionally, we will need to track public issues for customers who wish to remain anonymous on our public issue tracker. To do this, we choose an appropriate minor planet name from this Wikipedia page and create a label which we attach to the issue and any future issues for this customer.
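If the GitHub CLI is available, the labeling workflow above can be sketched as follows. The codename, repository, and issue number here are made-up examples; pick an unused minor planet name for the real label.

```shell
# Hypothetical example: track an anonymous customer's issue under a
# minor-planet codename label. "customer-eris" and issue 1234 are invented.
CODENAME="customer-eris"

# Create the label once per customer (color and description are arbitrary).
gh label create "$CODENAME" --repo fleetdm/fleet \
  --color BFD4F2 --description "Anonymous customer issue tracking"

# Attach it to the relevant issue (and to any future issues for this customer).
gh issue edit 1234 --repo fleetdm/fleet --add-label "$CODENAME"
```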
Locate the relevant issue, or create it if it doesn't already exist. (To avoid duplication, be creative when searching GitHub for issues; it can often take a couple of tries with different keywords to find an existing one.) When creating a new issue, make sure of the following:
- The issue has a "customer request" label or a "customer-codename" label.
- "+" prefixed labels (e.g., "+more info please") indicate we are waiting on an answer from an external community member who does not work at Fleet or that no further action is needed from the Fleet team until an external community member, who doesn't work at Fleet, replies with a comment. At this point, our bot will automatically remove the +-prefixed label.
- Required details that will help speed up time to resolution:
- Fleet server version
- Agent version
- Osquery or fleetd?
- Operating system
- Web browser
- Expected behavior
- Actual behavior
- Details that are nice to have but not required. These may be requested by Fleet support as needed:
- Total number of hosts
- Number of online hosts
- Number of scheduled queries
- Number and size (CPU/memory) of the Fleet instances
- Fleet instances' CPU and memory usage while the issue has been happening
- MySQL flavor/version in use
- MySQL server capacity (CPU/memory)
- MySQL CPU and memory usage while the issue has been happening
- Are MySQL read replicas configured? If so, how many?
- Redis version and server capacity (CPU/memory)
- Is Redis running in cluster mode?
- Redis CPU and memory usage while the issue has been happening
- The output of `fleetctl debug archive`
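Several of the details above can be gathered with a few `fleetctl` commands. A minimal sketch, assuming `fleetctl` is already configured against the affected Fleet instance:

```shell
# Report the Fleet server and fleetctl versions.
fleetctl version

# List enrolled hosts; the line count gives a rough total host count.
fleetctl get hosts | wc -l

# Bundle debug information (logs, profiles, config) for Fleet support.
fleetctl debug archive
```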
- Have we provided a link to that issue for the customer? This reminds everyone of the plan and, for the sake of visibility, brings folks who weren't directly involved up to speed (e.g., "Hi everyone, here's a link to the issue we discussed on today's call: …link…").
The acting developer on-call rotation is reflected in the 📈KPIs spreadsheet (confidential Google sheet). The developer on-call is responsible for responses to technical Slack comments, Slack threads, and GitHub issues raised by customers and the community, which the CSE team cannot address.
- To reach the developer on-call for assistance, mention them in Fleet Slack using `@oncall` in the #help-engineering channel.
- Support issues should be handled in the relevant Slack channel rather than Direct Messages (DMs). This will ensure that questions and solutions can be easily referenced in the future. If it is necessary to use DMs to share sensitive information, a summary of the conversation should be posted in the Slack channel as well.
- An automated weekly on-call handoff Slack thread in #g-engineering provides the opportunity to discuss highlights, improvements, and hand off ongoing issues.
- Customer Success Manager: Follow the training steps for this role.
- Customer Solutions Architect (CSA): Follow the training steps for this role.
- Customer Support Engineer (CSE): Follow the training steps for this role.
- Infrastructure Engineer: Follow the training steps for this role.
- A new message is posted in any Slack channel
- (Zapier filter) The automation will continue if the message is:
- Not from a Fleet team member
- Posted outside of Fleet’s business hours
- In a specific customer channel (manually designated by customer success)
- (Slack) Notify the sender that the request has been submitted outside of business hours and provide them with options for escalation in the event of a P0 or P1 incident.
- (Zapier) Send a text to the VP of CS to begin the emergency request flow if triggered by the original sender.
Note: New customer channels that the automation will run in must be configured manually. Submit requests for additions to the Zapier administrator.
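The "outside of business hours" condition in the Zapier filter above can be sketched as a simple check. This is a hypothetical illustration: the 9am-5pm, Monday-Friday window below is an assumption, not Fleet's actual configured business hours.

```shell
# Hypothetical sketch of the out-of-hours check performed by the Zapier filter.
# Business hours assumed here: 9:00-17:00, Monday-Friday (an assumption).
outside_business_hours() {
  day="$1"   # 1=Monday ... 7=Sunday
  hour="$2"  # 0-23, in the company's reference time zone
  if [ "$day" -ge 6 ] || [ "$hour" -lt 9 ] || [ "$hour" -ge 17 ]; then
    echo "yes"
  else
    echo "no"
  fi
}

# Check the current time; prints "yes" if the automation should fire.
outside_business_hours "$(date +%u)" "$(date +%H)"
```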
Fleet's self-service license key creator is the best way to generate a proof of concept (POC) or renewal/expansion Fleet Premium license key.
- Here is a tutorial on using the self-service method (internal video)
- Pre-sales license key DRI is the Director of Solutions Consulting
- Post-sales license key DRI is the VP of Customer Success
Legacy method: create an opportunity issue for the customer and follow the instructions in the issue for generating a trial license key.
Customer Support and 24/7 on-call Engineers are responsible for the first response to Slack messages in the #fleet channel of the osquery Slack and in other public Slack workspaces.
- The 24/7 on-call is responsible for alarms related to fleetdm.com and Fleet Managed Cloud, as well as delivering 24/7 support for Fleet Premium customers. Use on-call runbooks to guide your response. Runbooks provide detailed, step-by-step instructions to quickly and effectively respond to and resolve most 24/7 on-call alerts.
- We respond within one hour during business hours and within four hours outside business hours. Note that we do not need to have answers within one hour; we need to at least acknowledge the message and collect any additional necessary information while researching/escalating to find answers internally.
The first responder on-call for Managed Cloud will take ownership of the @infrastructure-oncall alias in Slack first thing Monday morning. The previous week's on-call will provide a summary in the #g-customer-success Slack channel with an update on alarms that came up the week before, open issues with or without direct end-user impact, and other issues to keep an eye out for.
- First responders: Robert Fairburn, Kathy Satterlee
Escalation of alarms will be done manually by the first responder according to the escalation contacts mentioned above. A suspected outage issue should be created to track the escalation and determine root cause.
- Escalations (in order): » Eric Shaw (fleetdm.com) » Zay Hanlon » Luke Heath » Mike McNeil
All infrastructure alarms (fleetdm.com and Managed Cloud) will go to #help-p1. When the current 24/7 on-call engineer is unable to meet the response time SLAs, it is their responsibility to arrange and designate a replacement who will assume the @oncall-infrastructure Slack alias.
The following stubs are included only to make links backward compatible.