Can Chinese full-text search be performed on Word, PPT, and PDF files stored in a BLOB column?


Date: 2014-05-17  Views: 21007

If documents such as Word, PPT, and PDF files are stored in a BLOB column, can Chinese full-text search be performed on them?
Or does it only work with a CLOB column?

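------Solution--------------------
You could add another column that holds a short description of the large column that cannot be queried directly, and run your searches against that description column instead.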
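
A minimal sketch of the description-column idea above. The table name doc_store and the column names doc_body and doc_note are hypothetical, chosen only for illustration; the original post does not name any table.

```sql
-- Hypothetical table: the binary file lives in doc_body, and
-- doc_note is a hand-written description that can be searched.
CREATE TABLE doc_store (
  id       NUMBER PRIMARY KEY,
  doc_body BLOB,
  doc_note VARCHAR2(4000)
);

-- Search the description column rather than the BLOB itself.
SELECT id
FROM doc_store
WHERE doc_note LIKE '%全文检索%';
```

This only finds what someone typed into the description, so it is a workaround rather than a true full-text search of the file contents.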
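------Solution--------------------
Below is an example of how to search XML documents.
interMedia Text supports indexing XML documents by means of a section group. A section group is a set of predefined nodes (tags) in the XML document. You can use WITHIN to restrict a search to a specific node, which makes the search more precise.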

1) First, create a table to store our XML document:

CREATE TABLE employee_xml(
id NUMBER PRIMARY KEY,
xmldoc CLOB )
/

2) Insert a simple document (the DTD is not required):
INSERT INTO employee_xml
VALUES (1,
'<?xml version="1.0"?>
<!DOCTYPE employee [
<!ELEMENT employee (Name, Dept, Title)>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT Dept (#PCDATA)>
<!ELEMENT Title (#PCDATA)>
]>
<employee>
<Name>Joel Kallman</Name>
<Dept>Oracle Service Industries Technology Group</Dept>
<Title>Technologist</Title>
</employee>');

3) Create an interMedia Text section group named 'xmlgroup' and add the Name and Dept tags to it. (Caution: in XML, tag names are case-sensitive, but tag names in section groups are case-insensitive.)
BEGIN
ctx_ddl.create_section_group ('xmlgroup', 'XML_SECTION_GROUP');
ctx_ddl.add_zone_section ('xmlgroup', 'Name', 'Name');
ctx_ddl.add_zone_section ('xmlgroup', 'Dept', 'Dept');
END;
/

4) Create our interMedia Text index, specifying the section group we created above.
Also specify the null_filter, since the Inso filter is not required for plain XML text.

CREATE INDEX employee_xml_index
ON employee_xml( xmldoc )
INDEXTYPE IS ctxsys.CONTEXT PARAMETERS(
'filter ctxsys.null_filter section group xmlgroup' )
/
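
The null_filter above suits plain XML text. For the binary formats in the original question (Word, PPT, PDF in a BLOB column), a document filter and a Chinese lexer would be specified instead. The following is only a sketch under stated assumptions: the table office_docs, the index office_docs_idx, and the preference name zh_lexer are hypothetical, and it assumes an Oracle Text release that provides AUTO_FILTER (older interMedia Text releases use the Inso filter, INSO_FILTER, in its place).

```sql
-- Hypothetical table holding binary documents in a BLOB column.
CREATE TABLE office_docs (
  id      NUMBER PRIMARY KEY,
  content BLOB
);

-- Lexer preference for Chinese text (the preference name is an assumption).
BEGIN
  ctx_ddl.create_preference('zh_lexer', 'CHINESE_VGRAM_LEXER');
END;
/

-- The filter extracts text from the binary formats before indexing.
CREATE INDEX office_docs_idx ON office_docs (content)
  INDEXTYPE IS ctxsys.CONTEXT
  PARAMETERS ('filter ctxsys.auto_filter lexer zh_lexer');

-- Query the BLOB column the same way as the CLOB column above.
SELECT id
FROM office_docs
WHERE contains (content, '全文检索') > 0;
```

Under these assumptions a CLOB column is not required; the index is built on the BLOB column directly.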

5) Now run a query that searches for a name within a specific section:
SELECT id
FROM employee_xml
WHERE contains (xmldoc, 'Joel within Name') > 0;

6) Only the content of non-empty tags is indexed, not the tag names themselves.
Thus, the following queries will return zero rows.
SELECT id
FROM employee_xml
WHERE contains (xmldoc, 'title') > 0;
SELECT id
FROM employee_xml
WHERE contains (xmldoc, 'employee') > 0;

7) But the following query will locate our document, even though we have not defined
Title as a section.
SELECT id
FROM employee_xml
WHERE contains (xmldoc, 'Technologist') > 0;

Let's say you want to get going right away with indexing XML, and don't want to have to specify sections for every element in your XML document collection. You can do this very easily by using the predefined AUTO_SECTION_GROUP. This section group is exactly like the XML section group, but the pre-definition of sections is not required. For all non-empty tags in your document, a zone section will be created with the section name the same as the tag name.

Use of the AUTO_SECTION_GROUP is also ideal when you may not know in advance all of the tag names that will be a part of your XML document set.

8) Drop our existing interMedia Text index.

DROP INDEX employee_xml_index
/

9) This time, recreate it specifying the AUTO_SECTION_GROUP.
We do not need to predefine the sections of our group; that is handled for us automatically.

CREATE INDEX employee_xml_index ON employee_xml( xmldoc )
INDEXTYPE IS ctxsys.CONTEXT PARAMETERS( 'filter ctxsys.null_filter section group ctxsys.auto_section_group' )
/

10) Once again, we can locate our document with a section search:
SELECT id
FROM employee_xml
WHERE contains (xmldoc, 'Technologist within Title') > 0;
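
Section searches can also be combined in a single query. As a small sketch, assuming the index was built with the AUTO_SECTION_GROUP so that both Name and Title exist as zone sections, the query below should match only documents where both conditions hold:

```sql
-- Both terms must appear, each inside its own section.
SELECT id
FROM employee_xml
WHERE contains (xmldoc, '(Joel within Name) and (Technologist within Title)') > 0;
```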

For details, see: http://epub.itpub.net/4/1.htm
