On the application and design of a front-end log library in SaaS products

2021/02/20 15:30

Author | Senior front-end development engineer at NetEase Smart Enterprise

1、 Preface

My company mainly provides enterprises with full-process enterprise services and a set of SaaS solutions. For enterprise SaaS products, a customer's purchase does not mean that the product's value has been fully delivered: a newly purchased product often requires a period of training and use before it truly generates value. In essence, helping customers solve the problems they meet while using the product is part of the paid service of a B-end product, and a very important link in the product value chain.

This article will focus on the application of front-end log libraries in the scenario of developers troubleshooting customer feedback problems, and how to design and develop front-end log libraries suitable for such scenarios.

2、 Challenges faced by the front end of SaaS business

I used to work on C-end business and naturally carried its thinking with me: I assumed it was enough for the front end to deliver the best possible interaction experience. Applying that practice to SaaS business cost me dearly. B-end and C-end products differ in payment method, in who makes the purchase decision versus who actually uses the product, and in what the product is for (B-end customers fundamentally buy a product to help the enterprise make money, whereas a C-end purchase may be impulsive or just for fun). As a result, the indicators SaaS enterprises watch differ greatly from those of C-end products, which in turn gives R&D a different orientation.

In terms of R&D investment, a C-end front end may optimize web performance regardless of cost to improve user experience and stickiness, because mobile Internet, e-commerce and similar industries care about indicators such as DAU, MAU and GMV. The core concern of most B-end products is whether they help customers improve efficiency: whether the product lets customers achieve their work goals, and achieve them quickly, matters far more than whether the interface is beautiful.

An important indicator of an excellent SaaS enterprise is NDR (net dollar retention). For front-end development in SaaS business, the first challenge is therefore how to improve the product's ease of use, task efficiency and service efficiency through software R&D, so as to increase the numerator of NDR (renewals from existing customers plus additional purchases).

Comparison between B-end and C-end products

| Dimension | B-end | C-end |
| --- | --- | --- |
| User scenarios | Clear purpose: help enterprises improve efficiency and quality | No clear purpose: mainly for pleasure and killing time |
| Page interaction mode | Rigorous process, low risk, high efficiency | Simple operation, simple information, entertainment and sociability |
| Common payment methods | Prepaid annually | Free |
| Common operating indicators | NDR, CAC | DAU, MAU, GMV |

It is worth mentioning that the Cloud Computing Software Product Use Experience Quality Measurement Model and Measurement Method also proposes five indicator dimensions for measuring product experience, which make a useful reference: ease of use, task efficiency, satisfaction, consistency and page performance. Ease of use covers ease of operation, ease of learning and clarity. Task efficiency covers function utilization, task completion rate and task completion time.

For the sake of sustainable revenue, one goal of a SaaS enterprise is to raise the share of output value that comes from the software product and lower the share that depends on manual services, because the marginal cost of value delivered by software is the lowest, and only then can service efficiency keep improving. The marginal cost of relying purely on human services is ultimately very high. In the SaaS scenario, therefore, using technology to improve the efficiency of problem solving is a significant piece of product value that the front end can provide.

3、 How to solve customer feedback problems

Customer feedback can first be divided into functional consultation, problem reports and pre-sales consultation. Developers mainly deal with problem reports, plus those functional consultations that technical support cannot answer and that are escalated to developers. For these we can generally build an internal troubleshooting system. The feedback that front-end developers mostly encounter comes not only from SDK integration, but also from reports in which customers believe a product function does not behave as expected.

To locate the causes of these problems quickly, the front end has two options. First, it can record logs on the client and report them to the troubleshooting system. Second, for urgent problems from Class S and Class A customers (classes in the SaaS enterprise's tiering model based on customer enterprise size), developers can contact the customer directly and, if necessary, use tools such as remote assistance to reproduce and locate the problem on the customer's device. For the second option, we designed and developed woodpecker-remote, a remote debugging solution based on the Chrome DevTools Protocol, which lets website developers remotely debug a website user's Chrome browser directly. For the first, we designed and developed the front-end log library woodpecker-log, which persists the client's running status and other information so that developers can debug and troubleshoot problems.

4、 The concept of front-end logs

First, the concept of front-end logs. Logs are most often mentioned in back-end development, where a log is a file that records the startup and running status of the server. A front-end log here means a record of the client's running status, stored on the client or uploaded to the server. In development and test environments it is usually enough for the front end to record running status with the console, but in production the client's log information needs to reach the server so that it is available later when troubleshooting user feedback.

5、 Problems and challenges of traditional front-end logs

  1. Front-end log libraries have low adoption and low visibility; most existing front-end tooling targets exception and performance monitoring.
  2. Logs are not standardized, and different log formats get mixed together.
  3. The log library itself consumes part of the front-end performance budget. It should keep its cost on the main JS thread as low as possible so that the main thread stays idle. Some front-end log libraries store logs on the server and upload each log as soon as it is generated, which consumes bandwidth and counts against the browser's limit on concurrent requests per domain name, slowing down normal business Ajax requests. The reporting frequency has to be balanced against the size of each report.
  4. Large applications (for example, with tens or hundreds of millions of users) tend to generate huge volumes of logs. The correlation between logs and bugs is low, and retrieval and analysis are expensive. For example, when massive logs are stored in Elasticsearch, Kibana queries become slow.
  5. Adding logs is inconvenient to develop and deploy. Front-end developers have to embed the logging code in advance; otherwise, when a problem needs to be located, there is no related log to analyze.
  6. The log library lacks the context of the problem and cannot trace a complete session. For example, it does not record the user's interface interactions before and after visiting a page, or behavior tracks such as page jumps.
  7. The feedback chain is long. If it is not deeply integrated with the product, clues are easily lost along the way, resulting in high communication costs. One solution is to attach the current log information automatically when the customer reports a problem, and to link it to the internal work order system, Jira and other feedback and resolution systems.

6、 Design of a front-end log library for B-end business

Among the problems above, the first to solve is the correlation between logs and user feedback. The core idea is to store logs on the client and have the user or the program report them actively when a problem occurs, rather than reporting to the server on a schedule. This raises two questions: how does the user notice a problem, and how does the program detect one? Performance problems and JS exceptions are also sources of customer feedback, but according to the statistics of feedback sources in our daily SaaS business they are not the main ones, and both JS exception monitoring and performance monitoring already have relatively mature front-end infrastructure. The front-end log library therefore mainly covers customer feedback that is not caused by JS exceptions or performance problems.

6.1 Some typical scenarios where front-end logs need to be recorded

Before starting the design, think about how the front end will use the log library. The following typical scenarios may require front-end logging.

  1. Calls to third-party services: prepare for the worst and record what happens when the third-party service is unavailable.
  2. Pages with a tight performance budget that are performance-sensitive: some performance data needs to be reported.
  3. Core processes of the web page that need monitoring: when a JS assertion evaluates to false, the reason for the failure needs to be recorded.

For the third scenario, here is a simple demo that uses the front-end log library to record a log when a program assertion is false:

 Demo
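
The original demo code is not reproduced in this text, so the following is only a minimal TypeScript sketch of the idea. The `wpLog` object is a hypothetical in-memory stand-in that mimics just the `error` method listed in the API section; `assertAndLog` and the order data are illustrative names, not part of woodpecker-log.

```typescript
// Hypothetical stand-in for the log library: it keeps entries in memory
// instead of writing them to client storage.
interface LogEntry {
  level: "error";
  content: string;
  time: number;
}

const recorded: LogEntry[] = [];

const wpLog = {
  error(content: string): void {
    recorded.push({ level: "error", content, time: Date.now() });
  },
};

// Checks a condition on a core process; when it is false, records the reason.
function assertAndLog(condition: boolean, reason: string): boolean {
  if (!condition) {
    wpLog.error("Assertion failed: " + reason);
  }
  return condition;
}

// Example: an order total is expected to equal the sum of its items.
const order = { total: 30, items: [10, 15] };
const itemsSum = order.items.reduce((sum, price) => sum + price, 0);
const totalMatches = assertAndLog(
  order.total === itemsSum,
  "order total " + order.total + " does not match item sum " + itemsSum
);
```

The business code can keep running after a failed assertion; the point is that the reason is persisted so it can be reported later together with the customer's feedback.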

6.2 Maintainability of the log library SDK

Compared with maintaining thousands of lines of code in a single file, developing the SDK as an independent project with front-end engineering practices is more maintainable. Front-end engineering means adopting modular, component-based, standardized and automated technical solutions to address engineering quality and maintainability from the perspective of software engineering. The key technology choices are as follows.

  1. Programming language

For the SDK's underlying code, TypeScript naturally provides type declarations that double as readable documentation. Static type checking helps users of a framework or library find errors before the code runs, and editor IntelliSense provides useful API type hints.

  2. Build

The SDK needs to support multiple JS module formats for its users. Taking rollup as an example, the configuration is as follows:

 Take rollup as an example
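
The original configuration is not reproduced in this text. As a rough sketch of what a multi-format build might look like, the object below uses Rollup's `input` and `output` options (`file`, `format`, `name`, `sourcemap`). The entry path, output paths and the global name `WoodpeckerLog` are assumptions for illustration, and a local type is declared so the snippet compiles without the rollup package.

```typescript
// Local type mirroring the few Rollup output options used here, so the
// sketch compiles without the rollup package installed.
interface OutputTarget {
  file: string;
  format: "es" | "cjs" | "umd";
  name?: string; // global variable name, needed for the UMD bundle
  sourcemap: boolean;
}

// Hypothetical output paths: one bundle per module format.
const outputs: OutputTarget[] = [
  { file: "dist/index.esm.js", format: "es", sourcemap: true },
  { file: "dist/index.cjs.js", format: "cjs", sourcemap: true },
  { file: "dist/index.umd.js", format: "umd", name: "WoodpeckerLog", sourcemap: true },
];

const rollupConfig = {
  input: "src/index.ts", // hypothetical entry point
  output: outputs,
  // A TypeScript plugin would be added here in a real build (assumption).
  plugins: [] as unknown[],
};

// In rollup.config.ts this object would be the default export:
// export default rollupConfig;
```

Emitting ES module, CommonJS and UMD bundles from one entry lets bundler users, Node-style consumers and plain script-tag users all load the same SDK.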

  3. Automated testing

In the development phase, automated testing of the SDK focuses on unit testing and integration testing. Unit tests verify the correctness of modules, functions or classes and can use the Jest framework. Note that unit tests for indexedDB storage and queries need a simulated database; fake-indexeddb can be used as the mock. Integration tests assemble the modules and use real external dependencies, accessing a real indexedDB database, to verify the correctness of the code. For this we used Karma + Mocha + Chai to test in Chrome Headless, Firefox Headless and Safari.

  4. Version control

Versions follow the semantic versioning specification (semver).

  5. Demo and documentation

6.3 Using indexedDB to store logs on the client

localStorage is suitable for key-value storage of small amounts of data. IndexedDB is better suited to client-side log storage and has the following advantages:

  • Stores and queries structured data, and supports binary data
  • Supports transactions
  • Asynchronous
  • Handles large amounts of data

Assuming a 10 MB quota and 300 bytes per log entry, 34,952 entries can be stored; with cyclic recording over at most 8 days, that is an average of 4,369 entries per day.
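
The capacity estimate above can be checked with a few lines of arithmetic. This sketch assumes 1 MB = 1024 × 1024 bytes and a fixed entry size; the function names are illustrative.

```typescript
const BYTES_PER_MB = 1024 * 1024;

// How many fixed-size entries fit in the quota.
function maxEntries(quotaMB: number, bytesPerEntry: number): number {
  return Math.floor((quotaMB * BYTES_PER_MB) / bytesPerEntry);
}

// Average entries per day if the quota is spread evenly over the retention window.
function entriesPerDay(totalEntries: number, days: number): number {
  return Math.floor(totalEntries / days);
}

const total = maxEntries(10, 300); // 34952
const perDay = entriesPerDay(total, 8); // 4369
```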

6.4 Performance overhead

  • Network performance (latency, request failure rate), driven by log length and request volume
    • Use an independent domain name to receive log requests, because Chrome limits the number of concurrent connections to the same domain name
    • DNS prefetch
    • Compress strings before storing logs
      • LZMA-JS is a JavaScript implementation of the LZMA compression algorithm. Measured at level 6 on a 300-byte log, the compression ratio was 79%
  • sendBeacon
    • Data is transmitted reliably and asynchronously, without delaying page unload or affecting the load of the next navigation
  • Merge requests
    • Combine multiple small logs and report them in pages of about 1 MB each
  • HTTP/2 header compression
  • Runtime performance
    • Fully asynchronous, non-blocking storage, retrieval and reporting
    • A storage queue supports batched log writes
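
The "report in pages" idea from the list above can be sketched as a greedy packing function. This is only an illustration under stated assumptions: logs are already serialized strings, size is measured as UTF-8 bytes, and `paginateLogs` and `utf8ByteLength` are hypothetical helper names rather than part of the SDK.

```typescript
// Approximate UTF-8 size of a string without relying on TextEncoder.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) {
      bytes += 1;
    } else if (code < 0x800) {
      bytes += 2;
    } else if (code >= 0xd800 && code <= 0xdbff) {
      bytes += 4; // surrogate pair: one 4-byte code point
      i++;
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

// Greedily packs logs into pages whose total size stays within maxBytes.
// A single log larger than maxBytes gets a page of its own.
function paginateLogs(logs: string[], maxBytes: number): string[][] {
  const pages: string[][] = [];
  let current: string[] = [];
  let currentBytes = 0;
  for (let i = 0; i < logs.length; i++) {
    const size = utf8ByteLength(logs[i]);
    if (current.length > 0 && currentBytes + size > maxBytes) {
      pages.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(logs[i]);
    currentBytes += size;
  }
  if (current.length > 0) {
    pages.push(current);
  }
  return pages;
}

// Tiny limit for demonstration; in production it would be about 1024 * 1024.
const pages = paginateLogs(["aaaa", "bbbb", "cccc"], 8);
```

Each page would then be sent as one request, which keeps the number of requests low without letting a single request grow unbounded.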

6.5 API design

SDK Initialization Settings

 SDK Initialization

| Parameter | Type | Description | Default | Required/Optional |
| --- | --- | --- | --- | --- |
| options.appKey | String | The application name stored with each log the instance records, used to distinguish logs recorded by different applications. If omitted, the instance uses $anonymous as the application name | $anonymous | Optional |
| options.bytesQuota | Number | Upper limit, in MB, of the indexedDB storage available to the client. Different applications share the limit. When it is exceeded, cyclic recording is enabled and the oldest logs are deleted automatically | 10 | Optional |
| options.reportUrl | String | Server address that the report method uses for reporting logs. If omitted, the address must be specified when calling report | -- | Optional |
| options.enableSendBeacon | Boolean | When enabled, logs are reported with sendBeacon | false | Optional |
| options.debug | Boolean | When enabled, debugging information is printed to the client console | false | Optional |
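
To make the defaults concrete, here is a small TypeScript sketch of how the options listed above might be typed and resolved. The option names and default values come from the parameter list; the interface names and the `resolveOptions` helper are hypothetical and do not claim to match the SDK's internals.

```typescript
// Option names and defaults follow the parameter list above.
interface WpLogOptions {
  appKey?: string;
  bytesQuota?: number; // in MB
  reportUrl?: string;
  enableSendBeacon?: boolean;
  debug?: boolean;
}

interface ResolvedOptions {
  appKey: string;
  bytesQuota: number;
  reportUrl: string | undefined;
  enableSendBeacon: boolean;
  debug: boolean;
}

// Hypothetical helper: fills in the documented defaults for omitted options.
function resolveOptions(options: WpLogOptions = {}): ResolvedOptions {
  return {
    appKey: options.appKey !== undefined ? options.appKey : "$anonymous",
    bytesQuota: options.bytesQuota !== undefined ? options.bytesQuota : 10,
    reportUrl: options.reportUrl, // no default: may be passed to report() instead
    enableSendBeacon: options.enableSendBeacon === true,
    debug: options.debug === true,
  };
}

// "crm-web" is an illustrative application name.
const resolved = resolveOptions({ appKey: "crm-web", debug: true });
```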

Instance methods

| Method | Description | Example |
| --- | --- | --- |
| trace/info/warn/error/fatal | Record a log on the client | wpLog.trace(content: string); |
| queryByDate | Retrieve logs by time of occurrence | wpLog.queryByDate(startDate?: number, endDate?: number); |
| queryByContent | Retrieve logs by keyword | wpLog.queryByContent(content: string); |
| report | Report logs to the server | wpLog.report(startDate?: number, days?: number, reportUrl?: string, session?: boolean, env?: boolean); |
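
The query semantics can be illustrated with an in-memory stand-in. This sketch makes several assumptions that the method list does not state: dates are millisecond timestamps, both bounds are inclusive, keyword search is a plain substring match, and the functions are synchronous (an indexedDB-backed implementation would be asynchronous).

```typescript
interface StoredLog {
  time: number; // timestamp in milliseconds
  content: string;
}

// In-memory stand-in for the client-side store.
const store: StoredLog[] = [
  { time: 1000, content: "login succeeded" },
  { time: 2000, content: "upload failed: timeout" },
  { time: 3000, content: "upload retried" },
];

// Both bounds are optional, mirroring queryByDate(startDate?, endDate?).
function queryByDate(startDate?: number, endDate?: number): StoredLog[] {
  return store.filter(
    (log) =>
      (startDate === undefined || log.time >= startDate) &&
      (endDate === undefined || log.time <= endDate)
  );
}

// Returns logs whose content contains the keyword.
function queryByContent(content: string): StoredLog[] {
  return store.filter((log) => log.content.indexOf(content) !== -1);
}

const sinceSecond = queryByDate(2000);
const uploads = queryByContent("upload");
```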

6.6 What to do when a problem occurs online but the code prints no log

Before publishing, we usually have to decide which logs should be printed while business-critical processes run. Otherwise, when a problem needs to be located we discover that no relevant log was output, which leaves us on the back foot: the only option is to change the code temporarily, add logs and publish again. Is there a way to add logs at the relevant places in the code when a problem appears, so that the logs are printed as soon as the business process runs, without going through the publishing process again? This is what woodpecker-proxy does. It uses the MutationObserver interface to watch for script elements being inserted into the DOM, rewrites the browser's JS requests and proxies them to a target server, so that you can modify the JS on the target server and add logging code that records logs in the user's browser. Demo address: DEMO

6.7 Log specification - print logs by level and application

Logs recorded according to good practice can be filtered quickly by level and by application when troubleshooting.

By level

| Log level | Description |
| --- | --- |
| trace | Mainly outputs debugging content. |
| info | Records the normal operation of the system, for example that an important business process has finished. |
| warn | Processing can continue when a problem of this level occurs, but the problem deserves extra attention. |
| error | An error has occurred that affects the user's normal access and needs to be handled immediately, though it is less urgent than fatal. |
| fatal | Fatal error. A very serious problem has occurred in the system and someone must deal with it immediately. |

By application

Because client storage is subject to the same-origin policy, logs can only be accessed under the page's own domain name. Multiple applications may record logs under the same domain name, so storing the application name with each log makes it easy to isolate the logs of different applications.
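
Filtering by level and application might look like the sketch below. The severity order (trace lowest, fatal highest) is the conventional reading of the levels described in this section and is an assumption here, as are the record shape, the `filterLogs` helper and the sample application names.

```typescript
type LogLevel = "trace" | "info" | "warn" | "error" | "fatal";

// Assumed severity order, lowest to highest.
const LEVEL_ORDER: LogLevel[] = ["trace", "info", "warn", "error", "fatal"];

interface LogRecord {
  appKey: string;
  level: LogLevel;
  content: string;
}

// Keeps records of one application at or above the given level.
function filterLogs(records: LogRecord[], appKey: string, minLevel: LogLevel): LogRecord[] {
  const threshold = LEVEL_ORDER.indexOf(minLevel);
  return records.filter(
    (r) => r.appKey === appKey && LEVEL_ORDER.indexOf(r.level) >= threshold
  );
}

// Two hypothetical applications sharing one domain name.
const sampleRecords: LogRecord[] = [
  { appKey: "crm-web", level: "info", content: "contact list loaded" },
  { appKey: "crm-web", level: "error", content: "save contact failed" },
  { appKey: "chat-sdk", level: "fatal", content: "connection lost" },
  { appKey: "crm-web", level: "warn", content: "slow response" },
];

const crmProblems = filterLogs(sampleRecords, "crm-web", "warn");
```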

6.8 What information needs to be collected in the context of the problem

  • Device, browser, page information
  • User interface interaction associated with a session
  • Page jump associated with a session

6.9 How to quickly find relevant logs when receiving customer feedback

  1. Store the user ID and session ID in each log and index them.
  2. Integrate log reporting into the SaaS application so that the current session's logs are queried and reported automatically when a customer submits feedback.
  3. Write the user ID and session ID into the internal work order system when the customer gives feedback, and carry them over to Jira when a bug is filed.
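
The three steps above can be sketched with an in-memory session index. Everything here is illustrative: the record shape, the `recordLog` and `buildFeedbackTicket` helpers and the ticket payload are hypothetical, and a real implementation would use indexed fields in client storage rather than a plain object.

```typescript
interface SessionLog {
  userId: string;
  sessionId: string;
  time: number;
  content: string;
}

// Index of logs keyed by session ID (plain object used as a dictionary).
const bySession: { [sessionId: string]: SessionLog[] } = {};

function recordLog(log: SessionLog): void {
  const existing = bySession[log.sessionId] || [];
  existing.push(log);
  bySession[log.sessionId] = existing;
}

// Hypothetical payload attached to a feedback ticket.
interface FeedbackTicket {
  userId: string;
  sessionId: string;
  description: string;
  logs: SessionLog[];
}

// Collects the current session's logs when the customer submits feedback.
function buildFeedbackTicket(userId: string, sessionId: string, description: string): FeedbackTicket {
  return { userId, sessionId, description, logs: bySession[sessionId] || [] };
}

recordLog({ userId: "u1", sessionId: "s1", time: 1, content: "open settings page" });
recordLog({ userId: "u1", sessionId: "s1", time: 2, content: "save settings failed" });
recordLog({ userId: "u2", sessionId: "s2", time: 3, content: "open dashboard" });

const ticket = buildFeedbackTicket("u1", "s1", "Settings cannot be saved");
```

Because the ticket carries the user ID and session ID, the same identifiers can be copied into the work order and the bug tracker, so a developer can go from the bug straight to the matching logs.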

7、 Direction of future efforts

  • Enhance reliability
    • How can the front-end log library keep working and still record logs when a web page crashes? Use a Service Worker to monitor web page crashes.
  • More intuitive problem context
    • Use a browser screen recording scheme to record the user interface before and after the problem occurs.
  • Friendlier customer notifications and alerts
    • Use the Notifications API for desktop notifications.
