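Problem:
Print the XML document below as a tree, laid out like a book's table of contents. The constraint is that no ready-made parsing approach may be used (no DOM, SAX, pull parsing and so on). In short: pick the nodes out by hand.
(Written purely for fun. If you spot a problem or have a better idea, feel free to get in touch, thanks. QQ: 413686520)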
<people >
<person id="001">
<name>XY1</name>
<age>22</age>
</person>
<name key1="fsefsef" key2="gdrhr" />
<person id="002">
<name>XY2</name>
<age>22</age>
<grade>
<math>98</math>
<english>100</english>
<music>28</music>
</grade>
</person>
<wenzhang key1="fsefsef" key2="gdrhr" />
<life key1="fsefsef" key2="gdrhr" />
</people>
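At first I misread what the problem setter wanted: I thought node names and node depths were enough, as long as the final output looked like a table of contents. That is where it went wrong. My first version read the input into memory one character at a time with in.read() and parsed it character by character. You can imagine how painful that was to finish: it took four hours, maybe more. I had skipped breakfast, and by the time it was done my stomach hurt from hunger. Still, parsing character by character did work in the end. The initial idea decides how hard the final algorithm turns out to be. Below is the result after rewriting it over the past two days:
Take the grade node as an example: its depth is 3 and it has the child nodes math, english and music. Other nodes may also carry key:value pairs and so on (see the figure).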
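Approach:
First, read one line at a time and cut the content after each closing bracket '>'. Every piece we then have to handle takes one of the following forms:
(1).***<******>
(2).***<******/>
(3).***</******>
Note: let us first fix the node format assumed in this article.
For example, in each of the three forms above, whatever is to the left of '<' can only be a node's text value, and whatever sits between '<' and '>' can only be a node name, key-value pairs, or the end-of-node marker.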
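The three forms above can be told apart with a couple of string checks. The following is only a minimal sketch of that idea, not the code used later in this article; the class name ChunkKinds, the Kind enum and both method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: cuts input after every '>' and labels each piece.
class ChunkKinds {
    enum Kind { OPEN, SELF_CLOSING, CLOSE, UNKNOWN }

    /** Cuts the input after every '>' so each chunk looks like "text<...>". */
    static List<String> split(String xml) {
        List<String> chunks = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < xml.length(); i++) {
            if (xml.charAt(i) == '>') {
                chunks.add(xml.substring(start, i + 1));
                start = i + 1;
            }
        }
        return chunks;
    }

    /** Maps a chunk to form (1) OPEN, (2) SELF_CLOSING or (3) CLOSE. */
    static Kind classify(String chunk) {
        int lt = chunk.indexOf('<');
        if (lt < 0 || !chunk.endsWith(">")) {
            return Kind.UNKNOWN;
        }
        if (chunk.startsWith("</", lt)) {
            return Kind.CLOSE;
        }
        if (chunk.endsWith("/>")) {
            return Kind.SELF_CLOSING;
        }
        return Kind.OPEN;
    }

    public static void main(String[] args) {
        for (String chunk : split("<name>XY1</name><life key1=\"a\" />")) {
            System.out.println(classify(chunk) + "  " + chunk);
        }
    }
}
```

For the sample string, the pieces are `<name>`, `XY1</name>` and `<life key1="a" />`, which fall into forms (1), (3) and (2) respectively. Note that the text XY1 travels with the closing tag, which is why the text to the left of '<' belongs to the node that is currently open.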
First, define the node class:
package com.lifestudy.stdy.lifestudy.readxml;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
/**
 * Created by lj on 2017/6/16.
 */
public class LjXMLNode {
    int id; // unique identifier
    String name; // node name
    List<Integer> children = new ArrayList<>(); // ids of this node's children
    int parent = -1; // id of the parent node; -1 means there is no parent
    Map<String,String> mapKeyValue = new HashMap<>(); // this node's key:value pairs
    String text = ""; // this node's text value
    int level; // depth of this node
}
Next, start walking through the XML file:
public void readXMLFile(InputStream inputStream){
    try {
        InputStreamReader inputStreamReader = new InputStreamReader(inputStream);
        BufferedReader in = new BufferedReader(inputStreamReader);
        System.out.println("Start reading input:");
        String s;
        String rest = ""; // unfinished fragment carried over from the previous line
        while( ( s = in.readLine() ) != null ){
            s = rest + s.trim();
            int start = 0;
            for( int i = 0; i < s.length(); i++ ){
                if( s.charAt(i) == '>'){
                    process(s.substring(start,i+1));
                    start = i + 1;
                }
            }
            // Keep whatever follows the last '>'. This is "" when the line ended on '>',
            // which also clears a fragment that has just been consumed.
            rest = s.substring(start);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
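To see the carry-over between lines in isolation, here is a small self-contained sketch, assuming the input arrives as a String. The class LineChunker and its method chunks are hypothetical names, and the sketch collects the pieces in a list instead of calling process().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of reading line by line and carrying a fragment forward.
class LineChunker {
    static List<String> chunks(String text) {
        List<String> out = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            String rest = "";
            String line;
            while ((line = in.readLine()) != null) {
                String s = rest + line.trim();
                int start = 0;
                for (int i = 0; i < s.length(); i++) {
                    if (s.charAt(i) == '>') {
                        out.add(s.substring(start, i + 1));
                        start = i + 1;
                    }
                }
                // Overwrite rest on every line; it becomes "" when nothing is left over.
                rest = s.substring(start);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // The text "XY1" is deliberately broken across two lines.
        System.out.println(chunks("<name>XY\n1</name>\n<age>22</age>"));
        // prints [<name>, XY1</name>, <age>, 22</age>]
    }
}
```

The fragment XY left over from the first line is glued onto the second line, so the closing tag arrives together with the full text XY1. The fragment has to be overwritten on every line, otherwise a stale piece would be prepended to a later line. Lines are joined without a separator here, so text that spans lines loses its line break.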
Then the process() method handles the three cases listed at the start.
Explanation: every character to the left of '<' belongs to a node's text value, and the first word to the right of '<' is always the node's name. Once the name has been read, we create an LjXMLNode called node, set node.name, point node.parent at the element on top of the stack, add node to that element's children, and push node onto the stack. Next we read the node's key:value pairs (detected by the presence of an equals sign) and update node accordingly. Finally, a tag can only end in '>' or '/>'. When we read a '/', the current node is popped off the stack. (If you want to do something with a node, popping is the moment to do it, because a popped node is complete: it has been fully traversed. This article stores each popped node in a map whose key is the id and whose value is the LjXMLNode.)
/**
 * Handles one chunk of the form "text<...>".
 * Mostly a matter of pushing to and popping from the stack.
 */
private void process(String s){
    int lt = s.indexOf('<');
    if( lt < 0 ){
        return; // no tag in this chunk
    }
    // Anything left of '<' is text of the node currently on top of the stack.
    // Empty text is skipped so that it cannot wipe out text stored earlier.
    String text = s.substring(0,lt);
    if( !text.isEmpty() && mStack.size() > 0 ){
        mStack.peek().text += text;
    }
    if( s.charAt(lt+1) == '/' ){
        popNode(); // closing tag such as </name>
        return;
    }
    //push
    LjXMLNode node = new LjXMLNode();
    node.id = ID_0++;
    node.name = findNodeName(s.substring(lt+1));
    if( mStack.size() > 0){
        LjXMLNode parentNode = mStack.peek();
        node.parent = parentNode.id;
        parentNode.children.add(node.id);
    }
    mStack.push(node);
    boolean inQuotes = false; // ignore '=' that appears inside a quoted value
    for( int j = lt+1; j < s.length();j++ ){
        char c = s.charAt(j);
        if( c == '"' ){
            inQuotes = !inQuotes;
        }else if( c == '=' && !inQuotes ){//key:value
            String key = findFirstLeftWord(s,j);
            String value = findFirstRightWord(s,j);
            if( !key.isEmpty() ){ // compare contents; "!=" would only compare references
                node.mapKeyValue.put(key,value);
            }
        }
    }
    if( s.endsWith("/>") ){
        popNode(); // self-closing tag such as <name ... />
    }
}
/**
 * Pops the finished node, records its depth and stores it by id.
 */
private void popNode(){
    if( mStack.isEmpty() ){
        return; // unbalanced closing tag, nothing to pop
    }
    LjXMLNode shuchuNode = mStack.pop();
    shuchuNode.level = mStack.size() + 1;
    mFinalNodes.put(shuchuNode.id,shuchuNode);
    System.out.println("shuchuNode:" + "id:"+shuchuNode.id + " name:"+shuchuNode.name
            + " key_count:"+shuchuNode.mapKeyValue.size()
            + " text:" + shuchuNode.text
            +" parent:"+ shuchuNode.parent);
}
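The key:value step can also be written as a single left-to-right scan over one tag, which avoids searching left and right of every '='. This is an alternative sketch, not the approach used in this article: AttributeScan and parseAttributes are hypothetical names, values are stored without their surrounding quotes, and unquoted values are simply not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical alternative: scan one tag from left to right and collect key="value" pairs.
class AttributeScan {
    static Map<String, String> parseAttributes(String tag) {
        Map<String, String> attrs = new LinkedHashMap<>(); // keeps document order
        int n = tag.length();
        int i = tag.indexOf('<') + 1;
        // Skip the node name: everything up to the first whitespace, '/' or '>'.
        while (i < n && !Character.isWhitespace(tag.charAt(i))
                && tag.charAt(i) != '/' && tag.charAt(i) != '>') {
            i++;
        }
        while (i < n) {
            // Skip whitespace between attributes.
            while (i < n && Character.isWhitespace(tag.charAt(i))) {
                i++;
            }
            if (i >= n || tag.charAt(i) == '/' || tag.charAt(i) == '>') {
                break; // end of the tag
            }
            int eq = tag.indexOf('=', i);
            if (eq < 0) {
                break; // malformed: a key without a value
            }
            String key = tag.substring(i, eq).trim();
            int open = eq + 1;
            while (open < n && Character.isWhitespace(tag.charAt(open))) {
                open++;
            }
            if (open >= n) {
                break;
            }
            char quote = tag.charAt(open);
            if (quote != '"' && quote != '\'') {
                break; // unquoted values are out of scope for this sketch
            }
            int close = tag.indexOf(quote, open + 1);
            if (close < 0) {
                break; // unterminated value
            }
            attrs.put(key, tag.substring(open + 1, close));
            i = close + 1;
        }
        return attrs;
    }

    public static void main(String[] args) {
        System.out.println(parseAttributes("<name key1=\"fsefsef\" key2=\"gdrhr\" />"));
        // prints {key1=fsefsef, key2=gdrhr}
        System.out.println(parseAttributes("<a href=\"x=y/z\">"));
        // prints {href=x=y/z}
    }
}
```

Because the scan jumps from the opening quote straight to the matching closing quote, an '=' or '/' inside a value is never mistaken for a separator or for the end of the tag.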
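The last step is traversing the nodes. After processing, the nodes end up stored as a tree (in general, a forest), as shown below:
So a depth-first traversal of the tree is used here. The idea is simple: build a stack and push the root node, which is node A in the figure. Then start traversing: pop root A, visit it, and push its children (note that the children are pushed starting from the rightmost one and moving left, so that the further left a child is, the earlier it gets visited). Every later node is handled the same way as root A: pop it, visit it, and push its children. The code is as follows: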
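public void depthLjFirst(){
    Stack<LjXMLNode> depthStack = new Stack<>();
    LjXMLNode head = mFinalNodes.get(0); // the root is the first node created, so its id is 0
    if( head == null ){
        return; // nothing was parsed
    }
    depthStack.push(head);
    while ( depthStack.size()> 0 ){
        LjXMLNode cur = depthStack.pop();
        visit(cur);
        // Push the children right to left so that the leftmost child is popped first.
        for( int i = cur.children.size(); i > 0;i-- ){
            int childId = cur.children.get(i-1);
            depthStack.push( mFinalNodes.get(childId) );
        }
    }
}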
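The effect of pushing children from right to left is easiest to see on a tiny tree. The sketch below is a stand-alone illustration with plain integer ids, not the parser's own code: DfsOrderDemo and preorder are made-up names, it uses ArrayDeque in place of java.util.Stack, and it assumes Java 9 or later for List.of and Map.of.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Hypothetical demo: iterative depth-first (pre-order) traversal over child-id lists.
class DfsOrderDemo {
    static List<Integer> preorder(Map<Integer, List<Integer>> children, int root) {
        List<Integer> visited = new ArrayList<>();
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            int cur = stack.pop();
            visited.add(cur); // "visit" the node
            List<Integer> kids = children.getOrDefault(cur, List.of());
            // Push right to left so the leftmost child ends up on top of the stack.
            for (int i = kids.size() - 1; i >= 0; i--) {
                stack.push(kids.get(i));
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // 0 has children 1, 2, 3; 1 has children 4, 5; 3 has child 6.
        Map<Integer, List<Integer>> tree = Map.of(
                0, List.of(1, 2, 3),
                1, List.of(4, 5),
                3, List.of(6));
        System.out.println(preorder(tree, 0)); // prints [0, 1, 4, 5, 2, 3, 6]
    }
}
```

Each subtree is finished before its right-hand sibling starts, which is exactly the order in which a table of contents lists its entries.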
The complete code is as follows:
package com.lifestudy.stdy.lifestudy.readxml;
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.Stack;
/**
* Created by lj on 2017/6/16.
*/
public class LjXMLParser {
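    private Stack<LjXMLNode> mStack = new Stack<>(); // nodes that are currently open
    Map<Integer,LjXMLNode> mFinalNodes = new HashMap<>(); // finished nodes, keyed by id
    private int ID_0 = 0; // next id to hand out; the root gets 0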
    public void readXMLFile(InputStream in1){
        try {
            InputStreamReader inputStreamReader = new InputStreamReader(in1);
            BufferedReader in = new BufferedReader(inputStreamReader);
            System.out.println("Start reading input:");
            String s;
            String rest = ""; // unfinished fragment carried over from the previous line
            while( ( s = in.readLine() ) != null ){
                s = rest + s.trim();
                int start = 0;
                for( int i = 0; i < s.length(); i++ ){
                    if( s.charAt(i) == '>'){
                        process(s.substring(start,i+1));
                        start = i + 1;
                    }
                }
                // Keep whatever follows the last '>'. This is "" when the line ended on '>',
                // which also clears a fragment that has just been consumed.
                rest = s.substring(start);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    /**
     * Handles one chunk of the form "text<...>".
     * Mostly a matter of pushing to and popping from the stack.
     */
    private void process(String s){
        int lt = s.indexOf('<');
        if( lt < 0 ){
            return; // no tag in this chunk
        }
        // Anything left of '<' is text of the node currently on top of the stack.
        // Empty text is skipped so that it cannot wipe out text stored earlier.
        String text = s.substring(0,lt);
        if( !text.isEmpty() && mStack.size() > 0 ){
            mStack.peek().text += text;
        }
        if( s.charAt(lt+1) == '/' ){
            popNode(); // closing tag such as </name>
            return;
        }
        //push
        LjXMLNode node = new LjXMLNode();
        node.id = ID_0++;
        node.name = findNodeName(s.substring(lt+1));
        if( mStack.size() > 0){
            LjXMLNode parentNode = mStack.peek();
            node.parent = parentNode.id;
            parentNode.children.add(node.id);
        }
        mStack.push(node);
        boolean inQuotes = false; // ignore '=' that appears inside a quoted value
        for( int j = lt+1; j < s.length();j++ ){
            char c = s.charAt(j);
            if( c == '"' ){
                inQuotes = !inQuotes;
            }else if( c == '=' && !inQuotes ){//key:value
                String key = findFirstLeftWord(s,j);
                String value = findFirstRightWord(s,j);
                if( !key.isEmpty() ){ // compare contents; "!=" would only compare references
                    node.mapKeyValue.put(key,value);
                }
            }
        }
        if( s.endsWith("/>") ){
            popNode(); // self-closing tag such as <name ... />
        }
    }
    /**
     * Pops the finished node, records its depth and stores it by id.
     */
    private void popNode(){
        if( mStack.isEmpty() ){
            return; // unbalanced closing tag, nothing to pop
        }
        LjXMLNode shuchuNode = mStack.pop();
        shuchuNode.level = mStack.size() + 1;
        mFinalNodes.put(shuchuNode.id,shuchuNode);
        System.out.println("shuchuNode:" + "id:"+shuchuNode.id + " name:"+shuchuNode.name
                + " key_count:"+shuchuNode.mapKeyValue.size()
                + " text:" + shuchuNode.text
                +" parent:"+ shuchuNode.parent);
    }
    /**
     * Finds the first word in s (the part of a tag right after '<').
     * The word ends at a space, at '/' or at the closing bracket '>'.
     */
    private String findNodeName(String s){
        int start = 0;
        while( start < s.length() && s.charAt(start) == ' ' ){ // skip leading spaces
            start++;
        }
        int end = start;
        while( end < s.length() ){
            char c = s.charAt(end);
            if( c == ' ' || c == '>' || c == '/' ){ // '/' covers tags such as <br/>
                break;
            }
            end++;
        }
        return s.substring(start,end);
    }
    /**
     * Finds the first word to the left of index loc (the position of '=').
     */
    private String findFirstLeftWord(String s,int loc){
        int start,end;
        end = -1;
        start = -1;
        for( int i = loc-1; i > 0; i--){
            if( s.charAt(i) != ' '){ // last character of the word
                end = i;
                break;
            }
        }
        if( end < loc && end > 0){
            for( int i = end; i > 0; i--){
                if( s.charAt(i) == ' '){ // first space to the left of the word
                    start = i+1;
                    break;
                }
            }
        }
        if( start <= end && start != -1){
            return s.substring(start,end+1);
        }
        return "";
    }
    /**
     * Finds the first double-quoted word to the right of index loc.
     * The returned value still includes its surrounding double quotes.
     */
    private String findFirstRightWord(String s,int loc){
        int start,end;
        end = -1;
        start = -1;
        for( int i = loc + 1; i < s.length(); i++){
            if( s.charAt(i) == '"'){ // opening double quote
                start = i;
                break;
            }
        }
        if( start > loc && start < s.length()){
            for( int i = start+1; i < s.length(); i++){
                if( s.charAt(i) == '"'){ // closing double quote
                    end = i+1;
                    break;
                }
            }
        }
        if( start <= end && start != -1){
            return s.substring(start,end);
        }
        return "";
    }
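    public void depthLjFirst(){
        Stack<LjXMLNode> depthStack = new Stack<>();
        LjXMLNode head = mFinalNodes.get(0); // the root is the first node created, so its id is 0
        if( head == null ){
            return; // nothing was parsed
        }
        depthStack.push(head);
        while ( depthStack.size()> 0 ){
            LjXMLNode cur = depthStack.pop();
            visit(cur);
            // Push the children right to left so that the leftmost child is popped first.
            for( int i = cur.children.size(); i > 0;i-- ){
                int childId = cur.children.get(i-1);
                depthStack.push( mFinalNodes.get(childId) );
            }
        }
    }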
    private void visit(LjXMLNode node){
        StringBuilder sb = new StringBuilder();
        for( int i = 0; i < node.level; i++){
            sb.append("   ");// three spaces per level
        }
        sb.append(node.level);
        sb.append(" "+ node.name);
        Set<Map.Entry<String, String>> set2=node.mapKeyValue.entrySet();
        for (Iterator<Map.Entry<String, String>> iterator = set2.iterator(); iterator.hasNext();) {
            Map.Entry<String, String> entry = iterator.next();
            String key=entry.getKey();
            String valueString=entry.getValue();
            sb.append(" "+ key + "=" + valueString );
        }
        // Use equals() here: "!=" compares object references, so an empty string
        // produced by substring() can still be "!=" to the literal "".
        if( node.text != null&& !node.text.equals("") ){
            sb.append(" text:" + node.text);
        }
        System.out.println( sb.toString());
    }
}
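A side note on the empty-text check in visit(): in Java, != on two String values compares object references, not characters. A string built at run time can therefore be empty and still differ from the literal "" as an object. Whether substring() hands back a fresh object for an empty result depends on the JDK version, so the reference test is unreliable either way. The sketch below forces a fresh object with new String(""). EmptyTextCheck and hasText are hypothetical names.

```java
// Hypothetical demo of reference comparison versus content comparison for strings.
class EmptyTextCheck {
    /** True when the node text holds at least one character. */
    static boolean hasText(String text) {
        return text != null && !text.isEmpty();
    }

    public static void main(String[] args) {
        String fresh = new String("");          // empty content, but a new object
        System.out.println(fresh != "");        // true: the references differ
        System.out.println(fresh.equals(""));   // true: the contents match
        System.out.println(hasText(fresh));     // false
        System.out.println(hasText("XY1"));     // true
    }
}
```

With a reference test, the empty string above would be treated as real text and printed after "text:". Comparing contents with equals() or isEmpty() avoids that.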