
  1. Linguist 1 97 Fleer
  2. Linguist 167 Stanford

Wed Aug 6 1997

Confs: TEI, COCOSDA-97

Editor for this issue: Martin Jacobsen

  • Nancy M. Ide, TEI Conference: REMINDER
  • Mark Liberman, COCOSDA-97 in Rhodes

    Message 1: TEI Conference: REMINDER

    Date: Tue, 5 Aug 97 10:25:30 EDT
    From: Nancy M. Ide
    Subject: TEI Conference: REMINDER
    ***** ABSTRACTS DUE AUGUST 20!!! *****
    November 14-16, 1997
    Brown University
    Providence, Rhode Island, USA
    Sponsored by
    Brown University Computing and Information Services
    Brown University Libraries
    To commemorate the tenth anniversary of its founding, the Text Encoding Initiative (TEI) is sponsoring its first user conference, to be held 14-16 November 1997 at Brown University in Providence, Rhode Island.
    The TEI was established at an international planning meeting on text encoding standards, held at Vassar College on November 12-13, 1987. The TEI is sponsored by the Association for Computers and the Humanities, the Association for Computational Linguistics, and the Association for Literary and Linguistic Computing.
    The TEI Guidelines for Electronic Text Encoding and Interchange were published in spring of 1994. They provide an extensive SGML-based scheme for encoding electronic texts across a wide spectrum of text types and suitable for any kind of application. The Guidelines have already achieved wide-scale implementation in projects throughout North America and Europe.
    The TEI conference will bring together users of the TEI Guidelines in order to share ideas, experiences, and expertise, and to provide a forum for technical discussion and evaluation of the Guidelines as they have been implemented across a variety of applications. The topics include but are not limited to:
    o reports on the use of the TEI scheme in a particular project or in a particular application area or discipline
    o reports from particular user communities such as the builders and designers of electronic text centers, digital libraries, language corpora, electronic editions, multi-media databases, etc.
    o evaluations of the TEI scheme as applied to a particular class of texts or in a particular type of scholarly research
    o technical discussions of particular encoding problems and solutions such as unusual or complex text types, multi-media, multiple views or information types, multi-lingual data and internationalization, textual variation, overlap, etc.
    o papers on customization and extension of the TEI for particular application areas and text types
    o reports on experience using off-the-shelf software with TEI documents, or developing software to handle TEI material
    o discussions of markup theory and markup architectures, with particular reference to the TEI
    o discussions of the TEI in the light of developments in the larger computing community (the Web, XML, ...)
    A portion of the conference will also be devoted to consideration of the future of the TEI. Possible topics to be discussed include the organization of the project, membership on the component committees, priorities, and new work items to be proposed to the Technical Review Committee.
    Submissions of at least 1500 words should be sent by August 20, 1997. Email submissions or a URL where the submission can be retrieved should be sent to Submissions in TEI Lite are preferred, but full TEI or (valid!) HTML 3.2 is acceptable. If it is not possible to submit in one of these forms, please to make special arrangements.
    Papers should include complete references to related work and should clearly identify the main problem being addressed, other similar projects and their relation to this project, the main and original contribution of the paper, and remaining or open problems. Authors are also asked to indicate if this paper is or will be submitted elsewhere.
    Notification of acceptance will be made by September 20, 1997. Final versions of full papers will be due by October 15, 1997. An electronic conference proceedings will be published; other publication details will be forthcoming.
    * Nancy Ide, Vassar College
    * C. M. Sperberg-McQueen, University of Illinois at Chicago

    * Susan Armstrong, University of Geneva
    * Winfried Bader, German Bible Society
    * David Barnard, University of Regina (Sask.)
    * Lou Burnard, Oxford University Computing Services
    * Tom Corns, University of Wales, Bangor
    * Steve DeRose, Inso Corp.
    * David Gants, University of Georgia
    * Dan Greenstein, King's College, London
    * Susan Hockey, University of Alberta
    * Stig Johansson, University of Oslo
    * Judith Klavans, Columbia University
    * Terry Langendoen, University of Arizona
    * Elli Mylonas, Brown University
    * John Price-Wilkin, University of Michigan
    * Gary Simons, Summer Institute of Linguistics
    * Frank Tompa, University of Waterloo
    * Syun Tutiya, Chiba University
    * Antonio Zampolli, University of Pisa

    On program and paper submissions: tei10_program@stg.brown.edu
    About local arrangements:

    Message 2: COCOSDA-97 in Rhodes

    Date: Tue, 5 Aug 1997 17:20:46 -0400 (EDT)
    From: Mark Liberman
    Subject: COCOSDA-97 in Rhodes
    COCOSDA, the Coordinating Committee on Speech Databases and Assessment, was founded in 1991, and has held yearly workshops ever since.
    The 1997 COCOSDA workshop, on the theme 'Standards and Tools for Linguistic Annotation of Speech Databases,' will take place at the Convention Centre of the Rodos Palace Hotel, in Rhodes, Greece, on the two days following the Eurospeech meeting: Friday, September 26 and Saturday, September 27. It will be co-located with the COST workshop on 'Speech Technology in the Public Telephone Network: Where are we today?' held in the same facility on the same two days. Overall registration is limited to 200, 100 from each organization.
    COCOSDA aims to promote collaborative work and information exchange for resources and standards in Spoken Language Engineering. It maintains working groups on Speech Corpora and Labelling, Speech Synthesis Assessment, and Speech Recognition Assessment. COCOSDA workshops include reports on relevant activities around the world, and discussions of topics of mutual interest.
    Further information about COCOSDA can be found at the URL, and further information about COST (European Cooperation in the field of Scientific and Technical Research) can be found at
    Submissions on the theme of COCOSDA'97, as well as other relevant subjects, are invited.
    To register for COCOSDA97, see'97 attendees are welcome to attend sessions of the COST workshop as well, though they will have to register separately for COST in order to get a copy of the proceedings.
    On Friday afternoon, there will be a joint COST/COCOSDA session on the topic of Speech Recognition.
    On Friday morning and Saturday afternoon, COCOSDA'97 will meet separately from COST. There will be both reports of general interest and presentations on the workshop theme.
    On Saturday morning, the three COCOSDA working groups (on Speech Corpora and Labelling, Speech Synthesis Assessment, and Speech Recognition Assessment) will meet separately, as arranged by their individual organizers.
    COCOSDA'97 is focused on standards and tools for linguistic annotation of speech databases. If you would like to make a presentation on the workshop theme, or on another topic within COCOSDA's area of interest, please register for the workshop and send an abstract of 500 words or less to If possible, include a URL for papers or project descriptions. All good-faith submissions will be accommodated, though some may have to be placed in a poster session.
    Linguist 167 Stanford

    Books and monographs
    1. 2017. A Frequency Dictionary of Spanish: Core Vocabulary for Learners. Second edition: revised and expanded. Routledge. (Co-authored with Kathy Hayward Davies)
    2. 2010. A Frequency Dictionary of American English: Word Sketches, Collocates, and Thematic Lists. Routledge. (Co-authored with Dee Gardner.)
    3. 2009. Corpus linguistic applications: current studies, new directions. Rodopi. (Co-editor, with Stefan Gries and Stefanie Wulff.)
    4. 2007. A Frequency Dictionary of Portuguese: Core Vocabulary for Learners. Routledge. (Co-authored with Ana Maria Raposo Preto-Bay)
    5. 2005. A Frequency Dictionary of Spanish: Core Vocabulary for Learners. Routledge.
    6. 2004. El uso del Corpus del Español y otros corpus para investigar la variación actual y los cambios históricos. Tokyo: Univ. Sophia.
    Journal articles and chapters
    7. 2020. 'The TV and Movies corpora: design, construction, and use.' International Journal of Corpus Linguistics.
    8. 2020. 'Constitución de corpus crecientes del español'. In Giovanni Parodi, Pascual Cantos, Chad Howe. The Routledge Handbook of Spanish Corpus Linguistics. (With Giovanni Parodi)
    9. 2019. 'The advantages and challenges of ‘big data': Insights from the 14 billion word iWeb corpus'. Linguistic Research 36(1), 1-34. (With Jong-Bok Kim)
    10. 2019. 'The best of both worlds: Multi-billion word ‘dynamic' corpora'. In Piotr Bański, et al. Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-7) 2019. Mannheim: Leibniz-Institut für Deutsche Sprache.
    11. 2019. 'If olive oil is made of olives, then what's baby oil made of? The shifting semantics of Noun+Noun sequences in American English.' In J. Egbert & P. Baker (Eds.), Using corpus methods to triangulate linguistic analysis, New York: Routledge. 163-84. (With Jesse Egbert)
    12. 2019. 'Historical shifts with the into-causative construction in American English.' Linguistics 57: 29-58. (With Jong-Bok Kim)
    13. 2018. 'Sorting them all out: Exploring the separable phrasal verbs of English.' System 76: 197-209. (With Dee Gardner)
    14. 2018. 'Using (and useful) corpora for the study of the history of English'. In Teaching the History of the English Language, eds. Chris Palmer and Colette Moore. MLA Options for Teaching Series.
    15. 2018. 'Corpus-based studies of lexical and semantic variation: The importance of both corpus size and corpus design.' In From data to evidence in English language research (Digital Linguistics), eds. Suhr, Carla, Terttu Nevalainen and Irma Taavitsainen. Leiden: Brill. 34-55.
    16. 2018. 'Uso del Corpus del Español y los corpus relacionados para la lexicografía histórica española.' In Historia del léxico español y Humanidades digitales. Eds. Alejandro Fajardo, et al. Berlin: Peter Lang. 49-76.
    17. 2017. 'Using Large Online Corpora to Examine Lexical, Semantic, and Cultural Variation in Different Dialects and Time Periods'. In Corpus-Based Sociolinguistics, ed. Eric Friginal et al. London: Routledge. 19-82.
    18. 2016. 'The Effect of Representativeness and Size in Historical Corpora: An Empirical Study of Changes in Lexical Frequency.' In Studies in the History of the English Language VII: Generalizing vs. particularizing methodologies in historical linguistic analysis, eds. Don Chapman, Colette Moore, and Miranda Wilcox. Berlin: De Gruyter / Mouton. 131-50. (With Don Chapman)
    19. 2016. 'The Into Causative Construction in English: A Construction-based Perspective.' English Language and Linguistics 20 (1): 55-83. (With Jong-Bok Kim)
    20. 2016. 'A response to ‘To what extent is the Academic Vocabulary List relevant to university student writing?'. English for Specific Purposes 42: 62-68. (With Dee Gardner)
    21. 2015. 'Corpora: An Introduction'. In Cambridge Handbook of English Corpus Linguistics, eds. Douglas Biber and Randi Reppen. Cambridge: Cambridge University Press. 11-31.
    22. 2015. 'A Corpus Linguistic Approach to Vocabulary Learning for University Students.' In ESL Readers and Writers in Higher Education: Understanding Challenges, Providing Support, eds. Norm Evans, Neil Anderson, and William Eggington. London: Routledge. 180-197. (With Dee Gardner)
    23. 2015. 'Introducing the 1.9 Billion Word Global Web-Based English Corpus (GloWbE).' 21st Century Text. (Peer-reviewed, online journal).
    24. 2015. 'Exploring the Composition of the Web: A Corpus-based Taxonomy of Web Registers'. Corpora 10 (1): 11-45. (With Douglas Biber and Jesse Egbert)
    25. 2015. 'Expanding Horizons in the Study of World Englishes with the 1.9 Billion Word Global Web-Based English Corpus (GloWbE).' English World-Wide 36: 1-28. (With Robert Fuchs)
    26. 2015. 'The importance of robust corpora in providing more realistic descriptions of variation in English grammar'. In Linguistic Vanguard (peer-reviewed online journal from Mouton de Gruyter)
    27. 2015. 'Developing a Bottom-up, User-based Method of Web Register Classification'. Journal of the Association for Information Science and Technology. (With Douglas Biber and Jesse Egbert)
    28. 2014. 'Making Google Books n-grams useful for a wide range of research on language change'. International Journal of Corpus Linguistics 19 (3): 401-16.
    29. 2014. 'Powerful (yet simple) comparisons of a wide range of phenomena in British and American English'. ICAME Journal 38: 35-56.
    30. 2014. 'Creating and Using the Corpus do Português and the Frequency Dictionary of Portuguese'. In Working with Portuguese Corpora, eds. Tony Berber Sardinha and Telma Ferreira. Continuum Publishers. 89-110.
    31. 2014. 'Examining syntactic variation in English: the importance of corpus design and corpus size'. English Language and Linguistics 19 (3): 1-35.
    32. 2013. 'Google Scholar vs. COCA: two very different approaches to examining academic English'. Journal of English for Academic Purposes 12: 155-165.
    33. 2013. 'A New Academic Vocabulary List.' In Applied Linguistics 35: 1-24. (With Dee Gardner)
    34. 2013. 'Establishing Corpora from Existing Data Sources'. In Data Collection in Sociolinguistics: Methods and Applications, ed. Christine Mallinson, et al. London: Routledge. 210-12.
    35. 2012. 'Expanding Horizons in Historical Linguistics with the 400 million word Corpus of Historical American English'. Corpora 7: 121-57.
    36. 2012. 'Examining Recent Changes in English: Some Methodological Issues'. In The Oxford Handbook of the History of English, eds. Terttu Nevalainen and Elizabeth Closs Traugott. Oxford: Oxford Univ. Press. 263-87.
    37. 2012. 'Recent shifts with three nonfinite verbal complements in English: Data from the 100 million word TIME Corpus (1920s-2000s)'. In Current Change in the English Verb Phrase, ed. Bas Aarts, et al. Cambridge: Cambridge Univ. Press. 46-67.
    38. 2012. 'The 400 Million Word Corpus of Historical American English (1810-2009)'. In English Historical Linguistics 2010, ed. Irén Hegedus, et al. Philadelphia: John Benjamins. 217-50.
    39. 2012. 'Looking at Recent Changes in English with the Corpus of Contemporary American English (COCA)'. 21st Century Text. (Peer-reviewed, online journal).
    40. 2012. 'Comparisons between Google Books and Google Books Corpus.' Computer-assisted Foreign Language Education. 145: 15-18. (With Xingfu Wang).
    41. 2011. 'Synchronic and Diachronic Uses of Corpora'. In Perspectives on Corpus Linguistics: Connections & Controversies, eds. Vander Viana, Sonia Zyngier and Geoff Barnbrook. Philadelphia: John Benjamins. 63-80.
    42. 2011. 'Creating and Using the Frequency Dictionary of Contemporary American English: Word Sketches, Collocates, and Thematic Lists'. In Corpus-based studies in language use, language learning, and language documentation, ed. John Newman, et al. Amsterdam: Rodopi. 283-97.
    43. 2011. 'The Corpus of Contemporary American English as the First Reliable Monitor Corpus of English'. Literary and Linguistic Computing 25: 447-65.
    44. 2010. 'More than a peephole: Using large and diverse online corpora'. International Journal of Corpus Linguistics 15: 405-11.
    45. 2010. 'Semantically-based, learner-oriented queries with the 400+ million word Corpus of Contemporary American English'. Łódź Studies in Language, ed. Stanislaw Goźdź-Roszkowski. Frankfurt: Peter Lang.
    46. 2010. 'Creating Useful Historical Corpora: A Comparison of CORDE, the Corpus del Español, and the Corpus do Português'. In Diacronía de las lenguas iberorromances: nuevas perspectivas desde la lingüística de corpus, ed. Andrés Enrique-Arias. Frankfurt/Madrid: Vervuert/Iberoamericana. 137-66.
    47. 2010. 'What students need (and want): semantically-oriented queries in large online corpora'. SYNAPS (Bergen) 24: 27-40.
    48. 2009. 'The 385+ Million Word Corpus of Contemporary American English (1990-2008+): Design, Architecture, and Linguistic Insights'. International Journal of Corpus Linguistics. 14: 159-90.
    49. 2009. 'Relational databases as a robust architecture for the analysis of word frequency'. In What's in a Wordlist?: Investigating Word Frequency and Keyword Extraction, ed. Dawn Archer. London: Ashgate. 53-68.
    50. 2008. 'Spanish and Portuguese Corpus Linguistics'. Studies in Hispanic and Lusophone Linguistics. 1: 149-86.
    51. 2008. 'The corpus-based Frequency Dictionary of Portuguese: A new tool for learners and teachers.' In Proceedings of TALC 8: Teaching and Language Corpora, ed. Ana Frankenberg-Garcia, et al. Lisbon. (Co-authored with Ana Maria Raposo Preto-Bay)
    52. 2008. 'The Corpus of Contemporary American English--a Useful Tool for English Teaching and Research'. Computer-Assisted Foreign Language Education in China. 5: 24-31. (Co-authored with Wang Xingfu and Liu Guohui).
    53. 2007. 'Pointing Out Frequent Phrasal Verbs: A Corpus-Based Analysis'. TESOL Quarterly 41: 339-59. (Co-authored with Dee Gardner)
    54. 2007. 'Semantically-based queries with a joint BNC/WordNet database'. In Corpus Linguistics Twenty-five Years On, ed. Roberta Facchinetti. Amsterdam: Rodopi. 149-167.
    55. 2006. 'Towards the first comprehensive survey of register variation in Spanish'. In Corpus Linguistics Beyond the Word: Corpus Research from Phrase to Discourse, ed. Eileen Fitzpatrick. Rodopi. 73-86.
    56. 2006. 'Vocabulary Coverage in Spanish Textbooks: How Representative is It?' In Selected Proceedings from the Conference on the Acquisition of Spanish and Portuguese as First and Second Languages, ed. Jacqueline Toribio. Cascadilla. 132-43. (Co-authored with Timothy L. Face).
    57. 2006. 'Spoken and written register variation in Spanish: A Multi-dimensional Analysis.' Corpora 1: 1-37. (Co-authored with Doug Biber, James Jones, and Nicole Tracy-Ventura).
    58. 2005. 'The advantage of using relational databases for large corpora: speed, advanced queries, and unlimited annotation'. International Journal of Corpus Linguistics 10: 301-28.
    59. 2005. 'On diachronic shifts with Spanish se: preliminary evidence from large electronic corpora.' In Romance Corpus Linguistics II: Corpora and Diachronic Linguistics, ed. Claus Pusch, et al. Gunter Narr. 431-42.
    60. 2005. 'Vocabulary Range and Text Coverage: Insights from the Forthcoming Routledge Frequency Dictionary of Spanish'. In Selected Proceedings from the 7th Hispanic Linguistics Symposium, ed. David Eddington. 106-15.
    61. 2005. 'Advanced research on syntactic and semantic change with the Corpus del Español'. In Romance Corpus Linguistics II: Corpora and Diachronic Linguistics, ed. Claus Pusch, et al. Gunter Narr. 203-14. Reprinted in: Corpus Linguistics. Critical Concepts in Linguistics (6 vols.). Ed. Teubert, Wolfgang & Ramesh Krishnamurthy. London: Routledge. 337-48 (Volume 5).
    62. 2004. 'Student use of large, annotated corpora to analyze syntactic variation'. In Corpora and Language Learners, ed. Guy Aston, et al. Philadelphia: John Benjamins. 259-69.
    63. 2004. 'Student use of large corpora to investigate language change'. In Applied Corpus Linguistics: A Multidimensional Perspective, ed. Thomas Upton, et al. Amsterdam: Rodopi. 207-22.
    64. 2003. 'Diachronic Shifts and Register Variation with the 'Lexical Subject of Infinitive' Construction. (Para yo hacerlo)'. In Linguistic Theory and Language Development in Hispanic Languages, ed. Silvina Montrul and Francisco Ordóñez. Somerville, MA: Cascadilla Press. 13-29.
    65. 2003. 'Annotation without lexicons: an alternative to the standard bootstrapping approach'. In Proceedings from Corpus Linguistics 2003, ed. Paul Rayson, et al. 174-83.
    66. 2002. 'Un corpus anotado de 100.000.000 palabras del español histórico y moderno'. SEPLN 2002 (Sociedad Española para el Procesamiento del Lenguaje Natural). 21-27.
    67. 2002. 'Esto es ligero de fazer: Object to Subject Raising in Medieval and Early Modern Spanish'. In Structure, Meaning, and Acquisition of Spanish, ed. James F. Lee, et al. Somerville, MA: Cascadilla Press. 19-31.
    68. 2001. 'Creating and using multi-million word corpora from web-based newspapers'. In Corpus Linguistics in North America, eds. Rita C. Simpson and John M. Swales. Ann Arbor: U Michigan P. 58-75.
    69. 2000. 'Using multi-million word corpora of historical and dialectal Spanish texts to teach advanced courses in Spanish linguistics'. In Rethinking Language Pedagogy from a Corpus Perspective, eds. Lou Burnard and Tony McEnery. Frankfurt am Main; New York: P. Lang. 173-85.
    70. 2000. 'Syntactic Diffusion in Spanish and Portuguese Infinitival Complements'. In New Approaches to Old Problems: Issues in Romance Historical Linguistics, eds. Steven Dworkin and Dieter Wanner. Amsterdam; Philadelphia: John Benjamins. 109-27.
    71. 1999. 'The Historical Development of Subject Raising in Portuguese: A Corpus-Based Approach'. Neuphilologische Mitteilungen 100: 95-110.
    72. 1999. 'A Computer Corpus-Based Study of Subject Raising in Modern Portuguese'. Lingvisticae Investigationes 21: 379-400.
    73. 1998. 'The Evolution of Spanish Clitic Climbing: A Corpus-Based Approach.' Studia Neophilologica 69: 251-63.
    74. 1997. 'A Corpus-Based Approach to Diachronic Clitic Climbing in Portuguese.' Hispanic Journal 17: 93-111.
    75. 1997. 'Using Large Computer-Based Corpora as a Philological Tool: An Analysis of Four Medieval Spanish Bibles.' Dactylus 16: 70-92.
    76. 1997. 'The History of Subject Raising in Spanish'. Bulletin of Hispanic Studies (Liverpool) 74: 399-411.
    77. 1997. 'A Corpus-Based Analysis of Subject Raising in Modern Spanish.' Hispanic Linguistics 9: 33-63.
    78. 1996. 'The Diachronic Interplay of Finite and Nonfinite Verbal Complements in Spanish and Portuguese.' Bulletin of Hispanic Studies (Glasgow) 73: 137-58.
    79. 1996. 'The Diachronic Evolution of the Causative Construction in Portuguese.' Journal of Hispanic Philology 17: 261-92.
    80. 1995. 'The Evolution of Causative Constructions in Spanish and Portuguese.' In Current Research in Romance Linguistics, ed. John Amastae, et al. Philadelphia: John Benjamins, 1995. 105-122.
    81. 1995. 'The Evolution of the Spanish Causative Construction.' Hispanic Review 63: 57-77.
    82. 1995. 'Analyzing Syntactic Variation with Computer-Based Corpora: The Case of Modern Spanish Clitic Climbing'. Hispania 78: 370-380.
    83. 1994. 'Parameters, Passives, and Parsing: Explaining Diachronic Shifts in Spanish and Portuguese'. In Variation and Linguistic Theory, ed. K. Beals, et al. Chicago: CLS. Vol 2. 46-60.
    84. 1992. 'A Tentative Bibliography of Historical Spanish Syntax.' Hispanic Linguistics 5: 279-351.
    Reviews
    85. 2009. Review of Using Spanish Corpora (Giovanni Parodi). Modern Language Journal. 93: 467-68.
    86. 2009. Review of The International Corpus of English – British Component (ICE-GB), the Diachronic Corpus of Present-day Spoken English (DCPSE), and ICECUP 3.1. Language. 85: 443-45.
    87. 2004. Review of Léxico Hispanoamericano (Peter Boyd-Bowman, et al). La Coronica: A Journal of Medieval Spanish Literature and Language 33: 259-64.
    88. 2004. Review of Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching (Sylvaine Granger, et al). Modern Language Journal. 88: 469-70.
    89. 2001. Review of Construcciones causativas en el español medieval (Milagros Alfonso Vega). Revista Canadiense de Estudios Hispánicos 25: 329-30.
    90. 1995. 'Omnipage and WordCruncher: Tools for Creating and Searching Digitized Text Corpora.' La Corónica 23: 111-115.

