XML parsing

2014/09/09 09:55

The object library can only parse XML into a fixed set of object elements. That is simple and convenient, but limiting. For large XML documents, or for finer control over individual elements, you can use the lower-level xml module that tbox provides separately.

The xml library of tbox provides two parsing modes: DOM parsing and SAX parsing.

DOM mode builds a DOM object tree and loads the whole document into memory in one pass. This resembles the object library, except that you have access to every element tag. SAX mode uses external iteration (the caller pulls events one at a time), which is more flexible and performs better. It also supports jumping to a user-specified path, similar to XPath, so that parsing starts at the part of the document you select.

DOM mode is the simpler of the two. The following example shows the whole flow:

    // initialize the stream
    tb_stream_ref_t istream = tb_stream_init_from_url("http://localhost/file.xml");
    if (istream)
    {
        // open the stream
        if (tb_stream_open(istream))
        {
            // initialize the reader
            tb_xml_reader_ref_t reader = tb_xml_reader_init(istream);
            if (reader)
            {
                // load the data; root is the root node
                tb_xml_node_t* root = tb_xml_reader_load(reader);
                if (root)
                {
                    // work with the nodes
                    // ...

                    // release the root node
                    tb_xml_node_exit(root);
                }

                // release the reader
                tb_xml_reader_exit(reader);
            }
        }

        // release the stream
        tb_stream_exit(istream);
    }

SAX mode is more efficient and flexible, and copes better with large XML documents. Because it works as an iterator, it parses as it reads and only does work for the data you are interested in, so it never needs to hold the whole document in memory. Combined with a stream, this lets you parse network data as it arrives.

I won't say much about it. Let's go directly to the code:

    // initialize the stream
    tb_stream_ref_t istream = tb_stream_init_from_url("http://localhost/file.xml");
    if (istream)
    {
        // open the stream
        if (tb_stream_open(istream))
        {
            // initialize the reader
            tb_xml_reader_ref_t reader = tb_xml_reader_init(istream);
            if (reader)
            {
                // initialize the xml reader event
                tb_size_t event = TB_XML_READER_EVENT_NONE;

                // walk all xml nodes; an empty event ends the loop
                while ((event = tb_xml_reader_next(reader)))
                {
                    switch (event)
                    {
                    // xml document node event
                    case TB_XML_READER_EVENT_DOCUMENT:
                        {
                            tb_printf("<?xml version = \"%s\" encoding = \"%s\" ?>\n"
                                , tb_xml_reader_version(reader), tb_xml_reader_charset(reader));
                        }
                        break;
                    // document type node event
                    case TB_XML_READER_EVENT_DOCUMENT_TYPE:
                        {
                            tb_printf("<!DOCTYPE>\n");
                        }
                        break;
                    // empty element node event, for example: <element/>
                    case TB_XML_READER_EVENT_ELEMENT_EMPTY:
                        {
                            // element name
                            tb_char_t const*        name = tb_xml_reader_element(reader);

                            // element attribute list
                            tb_xml_node_t const*    attr = tb_xml_reader_attributes(reader);

                            // xml node level, used for indentation
                            tb_size_t               t = tb_xml_reader_level(reader);
                            while (t--) tb_printf("\t");

                            // walk all element attributes
                            if (!attr) tb_printf("<%s/>\n", name);
                            else
                            {
                                tb_printf("<%s", name);
                                for (; attr; attr = attr->next)
                                    tb_printf(" %s = \"%s\"", tb_pstring_cstr(&attr->name), tb_pstring_cstr(&attr->data));
                                tb_printf("/>\n");
                            }
                        }
                        break;
                    // element begin event, for example: <element>
                    case TB_XML_READER_EVENT_ELEMENT_BEG:
                        {
                            // element name
                            tb_char_t const*        name = tb_xml_reader_element(reader);

                            // element attribute list
                            tb_xml_node_t const*    attr = tb_xml_reader_attributes(reader);

                            // xml node level, used for indentation
                            tb_size_t               t = tb_xml_reader_level(reader) - 1;
                            while (t--) tb_printf("\t");

                            // walk all element attributes
                            if (!attr) tb_printf("<%s>\n", name);
                            else
                            {
                                tb_printf("<%s", name);
                                for (; attr; attr = attr->next)
                                    tb_printf(" %s = \"%s\"", tb_pstring_cstr(&attr->name), tb_pstring_cstr(&attr->data));
                                tb_printf(">\n");
                            }
                        }
                        break;
                    // element end event, for example: </element>
                    case TB_XML_READER_EVENT_ELEMENT_END:
                        {
                            tb_size_t t = tb_xml_reader_level(reader);
                            while (t--) tb_printf("\t");
                            tb_printf("</%s>\n", tb_xml_reader_element(reader));
                        }
                        break;
                    // text node event
                    case TB_XML_READER_EVENT_TEXT:
                        {
                            tb_size_t t = tb_xml_reader_level(reader);
                            while (t--) tb_printf("\t");
                            tb_printf("%s", tb_xml_reader_text(reader));
                            tb_printf("\n");
                        }
                        break;
                    // CDATA node event, for example: <![CDATA[data]]>
                    case TB_XML_READER_EVENT_CDATA:
                        {
                            tb_size_t t = tb_xml_reader_level(reader);
                            while (t--) tb_printf("\t");
                            tb_printf("<![CDATA[%s]]>", tb_xml_reader_cdata(reader));
                            tb_printf("\n");
                        }
                        break;
                    // comment node event, for example: <!-- comment -->
                    case TB_XML_READER_EVENT_COMMENT:
                        {
                            tb_size_t t = tb_xml_reader_level(reader);
                            while (t--) tb_printf("\t");
                            tb_printf("<!--%s-->", tb_xml_reader_comment(reader));
                            tb_printf("\n");
                        }
                        break;
                    default:
                        break;
                    }
                }

                // release the reader
                tb_xml_reader_exit(reader);
            }
        }

        // release the stream
        tb_stream_exit(istream);
    }
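To make the external iteration idea concrete without depending on tbox, here is a toy pull reader over an in-memory string. It is only a sketch: every name in it (`toy_reader_t`, `toy_next`, `toy_count_elements`, the `TOY_EVENT_*` codes) is hypothetical, and it understands nothing but begin tags, end tags and text (no attributes, comments, CDATA or empty elements). What it shows is the shape of the loop: the caller asks for one event at a time and stops when the empty event comes back.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event codes; 0 means "no more events", like the loop above. */
enum { TOY_EVENT_NONE = 0, TOY_EVENT_BEG, TOY_EVENT_END, TOY_EVENT_TEXT };

typedef struct {
    const char* p;        /* current read position in the input */
    char        buf[64];  /* name or text of the most recent event */
} toy_reader_t;

static void toy_init(toy_reader_t* r, const char* xml)
{
    assert(r && xml);
    r->p = xml;
    r->buf[0] = '\0';
}

/* Copy [s, e) into r->buf, truncating if it does not fit. */
static void toy_copy(toy_reader_t* r, const char* s, const char* e)
{
    size_t n = (size_t)(e - s);
    if (n >= sizeof(r->buf)) n = sizeof(r->buf) - 1;
    memcpy(r->buf, s, n);
    r->buf[n] = '\0';
}

/* Advance to the next event; the caller decides when to stop pulling. */
static int toy_next(toy_reader_t* r)
{
    assert(r);
    if (!*r->p) return TOY_EVENT_NONE;
    if (*r->p == '<') {
        int is_end = (r->p[1] == '/');
        const char* s = r->p + (is_end ? 2 : 1);
        const char* e = strchr(s, '>');
        if (!e) return TOY_EVENT_NONE;   /* malformed input: stop */
        toy_copy(r, s, e);
        r->p = e + 1;
        return is_end ? TOY_EVENT_END : TOY_EVENT_BEG;
    }
    /* text runs until the next '<' or the end of the input */
    const char* e = strchr(r->p, '<');
    if (!e) e = r->p + strlen(r->p);
    toy_copy(r, r->p, e);
    r->p = e;
    return TOY_EVENT_TEXT;
}

/* Count begin-tag events, pulling one event at a time. */
static size_t toy_count_elements(const char* xml)
{
    toy_reader_t r;
    size_t n = 0;
    int ev;
    toy_init(&r, xml);
    while ((ev = toy_next(&r)) != TOY_EVENT_NONE)
        if (ev == TOY_EVENT_BEG) n++;
    return n;
}
```

Because the reader only keeps its current position and one small buffer, memory use does not grow with the size of the input, which is the property the SAX mode relies on.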

To parse only part of a document, jump to a specific path with tb_xml_reader_goto and start parsing from there:

    // initialize the stream
    tb_stream_ref_t istream = tb_stream_init_from_url("http://localhost/file.xml");
    if (istream)
    {
        // open the stream
        if (tb_stream_open(istream))
        {
            // initialize the reader
            tb_xml_reader_ref_t reader = tb_xml_reader_init(istream);
            if (reader)
            {
                // jump the reader to the specified path
                if (tb_xml_reader_goto(reader, "/root/node/data"))
                {
                    // load the data; root is the root node
                    tb_xml_node_t* root = tb_xml_reader_load(reader);
                    if (root)
                    {
                        // work with the nodes
                        // ...

                        // release the root node
                        tb_xml_node_exit(root);
                    }
                }

                // release the reader
                tb_xml_reader_exit(reader);
            }
        }

        // release the stream
        tb_stream_exit(istream);
    }
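One way a path jump like this can sit on top of a pull reader is to keep a running path string as begin and end events arrive, and to stop as soon as it equals the target. The sketch below illustrates that idea only; it is not how tbox implements tb_xml_reader_goto, and `toy_path_t` and its helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical running path, e.g. "/root/node/data". */
typedef struct {
    char   buf[256];
    size_t len;
} toy_path_t;

static void toy_path_init(toy_path_t* p)
{
    assert(p);
    p->buf[0] = '\0';
    p->len = 0;
}

/* On an element-begin event: append "/name". Returns 0 if it does not fit. */
static int toy_path_push(toy_path_t* p, const char* name)
{
    assert(p && name);
    size_t n = strlen(name);
    if (p->len + 1 + n >= sizeof(p->buf)) return 0;
    p->buf[p->len++] = '/';
    memcpy(p->buf + p->len, name, n + 1);  /* copies the trailing '\0' too */
    p->len += n;
    return 1;
}

/* On an element-end event: drop the last "/name" segment. */
static void toy_path_pop(toy_path_t* p)
{
    assert(p);
    while (p->len > 0 && p->buf[p->len - 1] != '/') p->len--;
    if (p->len > 0) p->len--;   /* also drop the '/' itself */
    p->buf[p->len] = '\0';
}

/* True when the reader currently sits exactly at the target path. */
static int toy_path_is(const toy_path_t* p, const char* target)
{
    assert(p && target);
    return strcmp(p->buf, target) == 0;
}
```

A caller would push on every begin event, pop on every end event, and hand control back to the user once `toy_path_is` reports a match.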

The tb_xml_node_t type is a tree built from linked lists. Once the whole object tree is loaded, it is easy to traverse:

    // node type definition; all other node types inherit from it
    typedef struct __tb_xml_node_t
    {
        // the node type
        tb_size_t                   type;

        // the node name
        tb_pstring_t                name;

        // the node data
        tb_pstring_t                data;

        // the next node, singly linked list
        struct __tb_xml_node_t*     next;

        // the head of the child nodes, singly linked list
        struct __tb_xml_node_t*     chead;

        // the tail of the child nodes
        struct __tb_xml_node_t*     ctail;

        // the number of child nodes
        tb_size_t                   csize;

        // the head of the attribute nodes, singly linked list
        struct __tb_xml_node_t*     ahead;

        // the tail of the attribute nodes
        struct __tb_xml_node_t*     atail;

        // the number of attribute nodes
        tb_size_t                   asize;

        // the parent node
        struct __tb_xml_node_t*     parent;

    }tb_xml_node_t;
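The head/tail/next layout makes appending a child a constant-time operation and turns a full tree walk into a short recursion. The sketch below uses a cut-down stand-in, `toy_node_t`, with a plain C string for the name and no attribute list; it is not the tbox type, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for the node layout above (attributes omitted). */
typedef struct toy_node_t {
    const char*        name;
    struct toy_node_t* next;    /* next sibling */
    struct toy_node_t* chead;   /* first child */
    struct toy_node_t* ctail;   /* last child */
    size_t             csize;   /* number of children */
    struct toy_node_t* parent;
} toy_node_t;

/* Append child at the tail of parent's child list in O(1). */
static void toy_node_append(toy_node_t* parent, toy_node_t* child)
{
    assert(parent && child);
    child->parent = parent;
    child->next = NULL;
    if (parent->ctail) parent->ctail->next = child;
    else parent->chead = child;
    parent->ctail = child;
    parent->csize++;
}

/* Depth-first count of node plus everything below it. */
static size_t toy_node_count(const toy_node_t* node)
{
    size_t n = 1;
    const toy_node_t* c;
    assert(node);
    for (c = node->chead; c; c = c->next) n += toy_node_count(c);
    return n;
}

/* Depth of a node: 0 for the root, found by following parent links. */
static size_t toy_node_depth(const toy_node_t* node)
{
    size_t d = 0;
    assert(node);
    for (; node->parent; node = node->parent) d++;
    return d;
}
```

Keeping a tail pointer is what avoids walking the sibling list on every append, and the parent link lets code climb back up without a separate stack.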

Traverse all child nodes:

    tb_xml_node_t* head = node->chead;
    for (node = head; node; node = node->next)
    {
        // only element nodes are handled here: <element></element> or <element/>
        if (node->type == TB_XML_NODE_TYPE_ELEMENT)
        {
            // length of the element name
            tb_size_t m = tb_pstring_size(&node->name);

            // print the element name
            tb_trace_d("%s", tb_pstring_cstr(&node->name));
        }
    }

Traverse all attribute nodes:

    tb_xml_node_t* head = node->ahead;
    for (node = head; node; node = node->next)
    {
        // print the name and data of the attribute node, for example: attr_name="data"
        tb_trace_d("%s=\"%s\"", tb_pstring_cstr(&node->name), tb_pstring_cstr(&node->data));
    }
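The same walk is all it takes to look up a single attribute by name. A minimal sketch, assuming a simplified attribute type `toy_attr_t` with plain C strings in place of `tb_pstring_t`; both the type and `toy_attr_find` are hypothetical and not part of tbox.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified attribute node: name, value, and a link to the next attribute. */
typedef struct toy_attr_t {
    const char*        name;
    const char*        data;
    struct toy_attr_t* next;
} toy_attr_t;

/* Return the value of the first attribute called name, or NULL if absent. */
static const char* toy_attr_find(const toy_attr_t* head, const char* name)
{
    const toy_attr_t* a;
    assert(name);
    for (a = head; a; a = a->next)
        if (strcmp(a->name, name) == 0) return a->data;
    return NULL;
}
```

The lookup is linear in the number of attributes, which is usually acceptable because elements rarely carry more than a handful.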
