Help and examples
Introduction
The PARES HTR search engine, funded by the Next Generation EU plan (Recovery and Resilience Facility), joins the efforts that the State Archives have made over decades to describe the documentary heritage held in their centres and to offer it to citizens through PARES. In addition to the traditional search engine, citizens now have a new search tool that gives direct access to the textual content of five documentary series, using AI techniques whose success has been proved in other European national archives.
The main aim of this guide is to show how to query the search engine and which options are available.
Examples of general queries
Single words
pan vino trigo cebada mercader alcabala nao navio isla islas tierra condado provincia reino herencia testamento caballero hombre marinero esclavo deudo moro mora gloria delito padre hermano hija hijos infanta infantes marido mujer biznieto Segovia Badajoz Logroño Madrid Murcia Valencia Mallorca Vitoria Granada Albaicin Gibraltar Alpujarras Bobadilla Montefrio Castilla Aragon Napoles Sicilia España Portugal Francia Paris Indias Andres Beatriz Carlos Diego Catalina Cristobal Enrique Hurtado Calderon Galindo Mendoza Navarro Pacheco Ramirez Fernandez Villegas
Wildcards and approximate spelling ("?", "*", "~")
?ernandez hij? hij?s ?u?iere Alpujarra* Enrique* arma* fanegada* salina* infant* alcabala* ?adre* ?u?ier* hijodalgo~ sobrino~ sacristan~2 conquista~2
Combined queries: word sequences or phrases ("[word sequence]")
- [ pan y vino ]
- [ siete mil maravedies ]
- [ quinientos catorce años ]
- [ mil cuatrocientos noventa y ocho años ]
- [ fieles cristianos ]
- [ cristiano* viejo* ]
- [ santa fe catolica ]
- [ villa de Tordesillas ]
- [ villa de Cabra ]
- [ ciudad de Ronda ]
- [ tierra firme ]
- [ islas de Canaria ]
- [ isla de San Juan ]
- [ Juan Gomez de Vedoya ]
- [ Beatriz Galindo ]
- [ Rey don Enrique ]
- [ Hurtado de Mendoza ]
- [ Francisco Ramirez de Madrid ]
- [ Andres Calderon ]
Combined queries: OR ("||")
- alcalde || alcaide
- nao || naos
- doscientos || trescientos
- abril || mayo || junio || julio
- [que santa gloria haya] || [que haya santa gloria]
Combined queries: AND ("&&" or a blank space)
- guerra && paz
- madre padre
- inquisicion pesquisa
- hija hijo hijas hijos
- conde && Barcelona
- enero && [quinientos cuatro]
- [Francisco Ramirez de Madrid] [Beatriz Galindo]
Combined queries: approximate AND ("&percentage&")
- señorío &25& provincia
- matrimonio &20& hijos~
- herencia &15& testamento
- conde* &10& Barcelona
- enemigos &30& [santa fe catolica] &20& moros
Combined queries: NOT ("-")
- puerto - ciudad - villa
- hereda* - heredar
- sobrino~ - sobrino
- alcalde~ - alcalde*
- arma* - arma - armas
- Hernandez~2 - Hernandez - Hernando - Fernando - Fernandez
Combined queries: combination of operators (AND, OR, etc.)
- [(isla||reino) de Sicilia]
- (enero || junio) && [quinientos cuatro]
- (mayo || junio || julio) &15& [quinientos (once || doce)]
- [((mil quinientos) || quinientos) catorce]
Examples of recommended confidence index values
- [islas de canaria~2] --- 40%
- [Beatriz Galindo] --- 25%
- [Villa de Talavera] --- 25%
- [principe Carlos] --- 20%
- navio* --- 20%
- Diego Pacheco --- 15%
- Puertorrico || [Puerto Rico] --- 15%
- Puertorrico [Isla de San Juan] [Sancho de Arango] --- 15%
- [tierra firme] && [mar oceano~] --- 5%
-------------------------------------------------------------------------------
Queries, confidence index and maximum number of results
At the top of the page there is a text box where you type the words and phrases that you want to find in the indexed manuscripts.
Below the text box there is another box for entering the confidence index, together with a slider for the same parameter. These let you set the degree of confidence, between 1 and 100, with which the search is carried out.
If the confidence index is high, there will be fewer results but they will probably be more accurate. If the confidence index is low, there will be more results, but with some errors in the retrieved words.
It is possible, and advisable, to use low confidence index values when searching for long words, word sequences or AND (&&) combinations (see below).
There is also a further text box where you can set the maximum number of results you want to view.
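The effect of the two parameters can be pictured with a minimal Python sketch. It is an illustration only, not the engine's implementation: the Spot class, the filter_spots function and the sample data are hypothetical, and it simply assumes that each match carries a confidence between 0 and 100 and that matches below the chosen index are discarded.

```python
from dataclasses import dataclass


@dataclass
class Spot:
    """A hypothetical search hit: a word found on a page with some confidence."""
    word: str
    page: str
    confidence: float  # assumed to range from 0 to 100


def filter_spots(spots, confidence_index, max_results):
    """Keep hits at or above the confidence index, best first, capped at max_results."""
    kept = [s for s in spots if s.confidence >= confidence_index]
    kept.sort(key=lambda s: s.confidence, reverse=True)
    return kept[:max_results]


# Invented sample data, for illustration only.
spots = [
    Spot("ALCABALA", "page_001", 92.0),
    Spot("ALCABALA", "page_007", 41.0),
    Spot("ALCABALA", "page_012", 12.0),
]

high = filter_spots(spots, confidence_index=60, max_results=10)
low = filter_spots(spots, confidence_index=10, max_results=10)
print(len(high), len(low))
```

With the high index only the most reliable hit survives; with the low index all three are returned, including the doubtful ones.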
Search and presentation of results
To run a search, set the desired confidence index and the maximum number of results that you want to view. Type the query into the text box and press the search button.
Results are presented at three hierarchical levels: SERIES, INSTALLATION UNIT and PICTURE (page). It is easier to open a new tab with the results each time you move between levels. That way you make sure you do not lose the search results.
Reviewing search results: series level
Once you run a search, the system shows the results at series level. Two indicators display, for each sub-series, the number of installation units that contain the relevant words or phrases.
Reviewing search results: installation unit level
Click one of the sub-series to browse the installation unit level of that sub-series on its own. For that sub-series, the system shows the installation units that contain the search terms or phrases, and the number of pages in each installation unit that have matches.
Reviewing search results: picture level
When you click the thumbnail of an installation unit, the interface shows a list of manuscript pictures.
For each entry in the picture list, the interface shows the page number, a thumbnail of the manuscript page, a bar indicating the confidence of the results, and the number of words that match the query terms. The interface shows the exact confidence values for each page, as a list of percentages, when you move the cursor over it.
Likewise, when you place the pointer on a picture thumbnail, the interface shows the exact name of the file.
Click the thumbnail of a manuscript picture to open that page.
On the page, the search results (called spots) are highlighted. The colour of the box around each matching word shows the confidence level: green is the highest level and red the lowest.
Running a new search
You can start a new search at any time by typing the search terms in the text box provided. If you are viewing an installation unit or a particular page, the system will search only within that installation unit or page.
To search across all series, go to the main page. You can get there by clicking HOME, located above the manuscript pictures, or by clicking the search image in the upper-left corner of the web page.
To search within a specific installation unit, browse to that installation unit, select it by clicking its thumbnail, and type your query in the text box as described above.
Advanced search
A wide range of search options is available to achieve more specific results and to obtain greater coverage or precision.
Punctuation marks
Any punctuation mark that appears in the search text is ignored.
The exception is symbols that act as search operators, such as wildcards, Boolean operators, parentheses and brackets.
Spelling and transliteration
The words identified in the pictures of these series have been indexed using a simplified spelling, to allow flexible spelling in queries and to improve the effectiveness of results. Differences between upper and lower case, accents and diacritical marks have been removed. For example, Guzmán becomes GUZMAN and Sigüenza becomes SIGUENZA. In addition, every special character is transliterated: for example, instead of veçino, CONÇEJO, eçeptuar and astræa, we should use VECINO, CONSEJO, EXCEPTUAR and ASTRAEA. However, the letter ñ is not transliterated: the correct form is SEÑOR, not SENOR, SEGNOR or SENYOR.
For this reason, queries can be written in plain text and in capital letters. To make queries easier to write, the system converts the text into the simplified spelling used for indexing. In other words, you can write as you like, and the system converts each word into its simplified form.
The result of applying these changes to each query is shown in an informative message below the text box, produced together with the search result. After entering Sigüenza CóRdoba ocaña, the system shows: 1 match found for "SIGUENZA CORDOBA OCAÑA" with a confidence of 60%.
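The case and accent folding described above can be sketched in a few lines of Python. This is an approximation under stated assumptions, not the engine's code: the function name simplify is hypothetical, only upper-casing and removal of accents and diacritics are covered, and the rules for transliterating special characters such as ç or æ, which the guide does not spell out, are not reproduced.

```python
import unicodedata


def simplify(text):
    """Upper-case the text and strip accents and diacritics, keeping the letter Ñ."""
    result = []
    # NFC first, so that "n" + combining tilde is treated as a single ñ.
    for ch in unicodedata.normalize("NFC", text).upper():
        if ch == "Ñ":
            result.append(ch)  # ñ is kept as it is
            continue
        # Split the character into base letter + combining marks, drop the marks.
        decomposed = unicodedata.normalize("NFD", ch)
        result.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(result)


print(simplify("Sigüenza CóRdoba ocaña"))  # SIGUENZA CORDOBA OCAÑA
```

The sample input is the one used in this guide, and the output matches the text shown in the informative message.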
Abbreviations, modernization, and dates
To search for abbreviated words (e.g. dho, Aud, a., d, Dest^o), you need to use the expanded and modernized form (e.g. dicho, audiencia, años, dias, destacamento).
Wildcards and approximate spelling
- ? wildcard. This symbol represents any single character. For example, hij?s can be used to search for HIJOS and HIJAS, or ?adre for padre and madre.
- * wildcard. This symbol represents any sequence of characters. For example, naufrag* can be used to search for naufragio, naufrago, naufragos, naufragar, naufragado, etc.
- Approximate spelling. The special character ~ can be added at the end of a word to search for words with an approximate spelling, in other words, words that differ from the specified spelling by only one character. For example, we can use Jimenez~ to search for occurrences of Jimenez, Gimenez, Ximenez, Jimenes and Jimmenez, or use Jimenez~2 to find also Gimenes, Jimmenes, Ximenes, Gimmenez, Jimeno, Jimena, etc.
- Note that if a high dissimilarity value is used for approximate spelling, or if a wildcard query contains few normal characters (other than ? and *), the system tends to return large numbers of results that are of little use. For this reason, the system limits the maximum dissimilarity value and the minimum number of normal characters, depending on the type of query, in order to avoid unproductive searches.
As a general recommendation, use approximate search and wildcards only with long words, keep the number of wildcards to a minimum and limit the dissimilarity degree to 1. Approximate search and wildcards place a high computational load on the servers, so we recommend using them sparingly.
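The wildcard and approximate spelling rules can be illustrated with a short Python sketch. It is not the engine's implementation: the function names are hypothetical, * is assumed to match an empty sequence as well (which agrees with the example arma* - arma - armas above), and "differs by N characters" is assumed to mean Levenshtein edit distance, which is consistent with the Jimenez examples.

```python
import re


def matches_wildcard(pattern, word):
    """True if word matches a pattern where ? is one character and * is any run."""
    parts = []
    for ch in pattern.upper():
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")  # assumption: * may also match nothing
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), word.upper()) is not None


def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions cost 1 each."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def matches_approx(term, word, max_diff=1):
    """True if word is within max_diff edits of term (term~ is 1, term~2 is 2)."""
    return edit_distance(term.upper(), word.upper()) <= max_diff


print(matches_wildcard("hij?s", "HIJAS"))        # True
print(matches_approx("Jimenez", "Ximenez"))      # True
print(matches_approx("Jimenez", "Gimenes"))      # False
print(matches_approx("Jimenez", "Gimenes", 2))   # True
```

The sketch also shows why short words and high dissimilarity values are discouraged: the fewer fixed characters a pattern has, the more unrelated words fall within its reach.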
Queries with multiple words
Individual words can be combined into compound queries in three ways: Boolean queries, approximate AND queries and word sequence queries.
- Boolean queries (AND, OR, NOT). The AND, OR and NOT operators are expressed with the following symbols:
  - AND: &&. For example, aceite && vino returns pictures that contain at least one occurrence each of aceite and vino. This operator can be omitted, so aceite && vino can also be written as aceite vino.
  - OR: ||. For example, Alpujarras || Alhambra returns results that contain the word Alpujarras or the word Alhambra.
  - NOT: -, placed before each word that we do not want to find. For example, Andalucia - Granada - Gibraltar returns results that contain Andalucia but not Granada or Gibraltar.
- Approximate AND queries. These are AND queries with a number that specifies how far apart the terms may be.
- Word sequence queries. These are AND queries that require the words to appear one after the other. They are expressed as a sequence of words between square brackets.
The number of search results reported is the total number of words found that match the query.
In an approximate AND query, the number is interpreted as a percentage of the whole picture and expresses the maximum horizontal or vertical distance allowed. In other words, if you search for pleito &10& justicia, the system will show pictures that contain the terms pleito and justicia separated from each other by no more than 10% of the whole picture.
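As a rough illustration of the distance check, here is a minimal Python sketch. It rests on assumptions that this guide does not confirm: each word is reduced to a single (x, y) position in pixels, the percentage is applied to the picture width for horizontal distance and to the picture height for vertical distance, and both limits must hold. The function name and the sample coordinates are hypothetical.

```python
def within_distance(pos_a, pos_b, picture_size, percent):
    """True if two word positions are no further apart than percent of the picture."""
    (xa, ya), (xb, yb) = pos_a, pos_b
    width, height = picture_size
    max_dx = width * percent / 100   # horizontal limit in pixels
    max_dy = height * percent / 100  # vertical limit in pixels
    return abs(xa - xb) <= max_dx and abs(ya - yb) <= max_dy


# pleito &10& justicia on a hypothetical 2000 x 3000 pixel picture
picture = (2000, 3000)
pleito = (400, 900)
justicia_near = (550, 1100)   # 150 px across, 200 px down
justicia_far = (1500, 2500)   # 1100 px across, 1600 px down

print(within_distance(pleito, justicia_near, picture, 10))  # True
print(within_distance(pleito, justicia_far, picture, 10))   # False
```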
Word sequence queries are not interpreted as exact stretches of text: a few short extra words are allowed between the terms. For example, [islas Canaria*] can return results that contain phrases such as islas de Canaria, islas Canarias, islas de la Canaria, islas de las Canarias, etc.
In this case, the number reported as the number of search results is the total number of times the sequence has been found.
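The tolerance for short extra words can be sketched in Python as follows. The guide does not say how many extra words are allowed or how short they must be, so the two limits below are purely hypothetical, as are the function names; the sketch only shows the idea of matching terms in order while skipping a few short words, and of counting how many times the sequence occurs.

```python
from fnmatch import fnmatchcase

MAX_EXTRA_WORDS = 2   # hypothetical limit, not stated in the guide
MAX_EXTRA_LENGTH = 3  # hypothetical limit, not stated in the guide


def _match_from(terms, words, start):
    """Try to match all terms in order, beginning exactly at words[start]."""
    pos = start
    for index, term in enumerate(terms):
        if index > 0:
            skipped = 0
            # Skip a few short words that do not match the next term.
            while (pos < len(words) and skipped < MAX_EXTRA_WORDS
                   and not fnmatchcase(words[pos], term)
                   and len(words[pos]) <= MAX_EXTRA_LENGTH):
                pos += 1
                skipped += 1
        if pos >= len(words) or not fnmatchcase(words[pos], term):
            return False
        pos += 1
    return True


def count_sequence(terms, words):
    """Count how many times the sequence of terms is found in the word list."""
    return sum(1 for start in range(len(words)) if _match_from(terms, words, start))


text = "LAS ISLAS DE LAS CANARIAS Y LAS ISLAS CANARIAS".split()
print(count_sequence(["ISLAS", "CANARIA*"], text))  # 2
```

Here fnmatchcase from the standard library handles the ? and * wildcards inside each term.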
Combining query types
Boolean, word sequence and approximate AND queries can be mixed arbitrarily to build more complex searches. You can also use wildcards and approximate spelling in all these types of query. For example, [puerto and bahía] && [ciudad de Cadiz], [ciudad de Ronda] && Andalucia, etc.
Parentheses () are used to group terms: for example, if we search for Francisco && (Fernandez || Ramirez), we obtain results with at least one occurrence of Francisco and one of Fernandez, Ramirez or both. Likewise, [puerto de (Cadiz || Veracruz)] will return puerto de Cadiz, puerto de Veracruz or both.
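The way grouping affects a Boolean query can be pictured with a small Python sketch. It is only a model of the logic, assuming a page is reduced to the set of simplified words found on it; the helper names term, all_of, any_of and exclude are hypothetical and do not correspond to anything in the search engine.

```python
def term(word):
    """Query that is true when the word appears on the page."""
    return lambda words: word in words


def all_of(*queries):
    """AND (&&): every sub-query must be true."""
    return lambda words: all(q(words) for q in queries)


def any_of(*queries):
    """OR (||): at least one sub-query must be true."""
    return lambda words: any(q(words) for q in queries)


def exclude(query):
    """NOT (-): the sub-query must be false."""
    return lambda words: not query(words)


# Francisco && (Fernandez || Ramirez)
grouped = all_of(term("FRANCISCO"), any_of(term("FERNANDEZ"), term("RAMIREZ")))

# Andalucia - Granada - Gibraltar
negated = all_of(term("ANDALUCIA"), exclude(term("GRANADA")), exclude(term("GIBRALTAR")))

print(grouped({"FRANCISCO", "RAMIREZ", "MADRID"}))  # True
print(grouped({"FRANCISCO", "MENDOZA"}))            # False
print(negated({"ANDALUCIA", "SEVILLA"}))            # True
print(negated({"ANDALUCIA", "GRANADA"}))            # False
```

Each pair of parentheses in a query corresponds to one nested call, which makes the order of evaluation explicit.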